Search Results: "kov"

12 July 2023

Reproducible Builds: Reproducible Builds in June 2023

Welcome to the June 2023 report from the Reproducible Builds project In our reports, we outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.

We are very happy to announce the upcoming Reproducible Builds Summit which set to take place from October 31st November 2nd 2023, in the vibrant city of Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. If you re interesting in joining us this year, please make sure to read the event page] which has more details about the event and location. (You may also be interested in attending PackagingCon 2023 held a few days before in Berlin.)

This month, Vagrant Cascadian will present at FOSSY 2023 on the topic of Breaking the Chains of Trusting Trust:

Corrupted build environments can deliver compromised cryptographically signed binaries. Several exploits in critical supply chains have been demonstrated in recent years, proving that this is not just theoretical. The most well secured build environments are still single points of failure when they fail. [ ] This talk will focus on the state of the art from several angles in related Free and Open Source Software projects, what works, current challenges and future plans for building trustworthy toolchains you do not need to trust.

Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, FOSSY aims to be a community-focused event: Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you . More information on the event is available on the FOSSY 2023 website, including the full programme schedule.

Marcel Fourn , Dominik Wermke, William Enck, Sascha Fahl and Yasemin Acar recently published an academic paper in the 44th IEEE Symposium on Security and Privacy titled It s like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security . The abstract reads as follows:

The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created.

However, in contrast to other papers that touch on some theoretical aspect of reproducible builds, the authors paper takes a different approach. Starting with the observation that much of the software industry believes R-Bs are too far out of reach for most projects and conjoining that with a goal of to help identify a path for R-Bs to become a commonplace property , the paper has a different methodology:

We conducted a series of 24 semi-structured expert interviews with participants from the Reproducible-Builds.org project, and iterated on our questions with the reproducible builds community. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We identify experiences that help and hinder adoption, which heavily include communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.

A PDF of the paper is now available, as is an entry on the CISPA Helmholtz Center for Information Security website and an entry under the TeamUSEC Human-Centered Security research group.

On our mailing list this month:

Martin Monperrus asked whether there are any project[s] where reproducibility is checked in a continuous integration pipeline? which received a number of replies from various projects.
Vagrant Cascadian mentioned that Packaging Con 2023 is being held in Berlin, the weekend before the Reproducible Builds summit later this year. In particular, Vagrant noticed that the Call for Proposals (CFP) closes at the end of July.
Larry Doolittle was searching Usenet archives and discovered a thread from December 1999 titled Time independent checksum(cksum) on comp.unix.programming. Larry notes that it starts with Jayan asking about comparing binaries that might have difference in their embedded timestamps (that is, perhaps, Foreshadowing diffoscope, amiright? ) and goes on to observe that:

The antagonist is David Schwartz, who correctly says There are dozens of complex reasons why what seems to be the same sequence of operations might produce different end results, but goes on to say I totally disagree with your general viewpoint that compilers must provide for reproducability [sic]. Dwight Tovey and I (Larry Doolittle) argue for reproducible builds. I assert Any program especially a mission-critical program like a compiler that cannot reproduce a result at will is broken. Also it s commonplace to take a binary from the net, and check to see if it was trojaned by attempting to recreate it from source.

Lastly, there were a few changes to our website this month too, including Bernhard M. Wiedemann adding a simplified Rust example to our documentation about the SOURCE_DATE_EPOCH environment variable [ ], Chris Lamb made it easier to parse our summit announcement at a glance [ ], Mattia Rizzolo added the summit announcement at a glance [ ] itself [ ][ ][ ] and Rahul Bajaj added a taxonomy of variations in build environments [ ].

Distribution work 27 reviews of Debian packages were added, 40 were updated and 8 were removed this month adding to our knowledge about identified issues. A new `randomness_in_documentation_generated_by_mkdocs` toolchain issue was added by Chris Lamb [ ], and the `deterministic` flag on the `paths_vary_due_to_usrmerge` issue as we are not currently testing `usrmerge` issues [ ] issues.
Roland Clobus posted his 18th update of the status of reproducible Debian ISO images on our mailing list. Roland reported that all major desktops build reproducibly with `bullseye`, `bookworm`, `trixie` and `sid` , but he also mentioned amongst many changes that not only are the `non-free` images being built (and are reproducible) but that the live images are generated officially by Debian itself. [ ]
Jan-Benedict Glaw noticed a problem when building NetBSD for the VAX architecture. Noting that Reproducible builds [are] probably not as reproducible as we thought , Jan-Benedict goes on to describe that when two builds from different source directories won t produce the same result and adds various notes about sub-optimal handling of the `CFLAGS` environment variable. [ ]
F-Droid added 21 new reproducible apps in June, resulting in a new record of 145 reproducible apps in total. [ ]. (This page now sports missing data for March May 2023.) F-Droid contributors also reported an issue with broken resources in APKs making some builds unreproducible. [ ]
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE

Upstream patches

Bernhard M. Wiedemann:

`bcachefs` (sort find / filesys)

`build-compare` (reports files as identical)

`build-time` (toolchain date)

`cockpit` (merged, gzip mtime)

`gcc13` (gcc13 toolchain LTO parallelism)

`ghc-rpm-macros` (toolchain parallelism)

`golangcli-lint` (date)

`gutenprint` (date+time)

`mage` (date (golang))

`mumble` (filesys)

`pcr` (date)

`python-nss` (drop sphinx .doctrees)

`python310` (merged, bisected+backported)

`warpinator` (merged, date)

`xroachng` (date)

Chris Lamb:

#1037159 filed against `elinks`.

#1037205 filed against `multipath-tools`.

#1037216 filed against `mkdocstrings-python-handlers`.

#1038730 filed against `fribidi`.

#1038957 filed against `jtreg7`.

#1039932 filed against `python-bitstring` (forwarded upstream).

Vagrant Cascadian:

#1037235 filed against `gradle-kotlin-dsl`.

#1037240 filed against `libsdl-console`.

#1037267 filed against `kawari8`.

#1037269 filed against `freetds`.

#1037272 filed against `gbrowse`.

#1037276 filed against `bglibs`.

#1037277 filed against `advi`.

#1037278 filed against `afterstep`.

#1037280 filed against `simstring`.

#1037296 filed against `manderlbot`.

#1037298 filed against `erlang-proper`.

#1037300 filed against `comedilib`.

#1037309 filed against `libint`.

#1037427 filed against `newlib`.

#1038908 filed against `binutils-msp430`.

#1038970 & #1038971 filed against `c-munipack`.

#1039044 filed against `python-marshmallow-sqlalchemy`.

#1039457 filed against `mplayer`.

#1039493 & #1039500 filed against `menu`.

#1039506 filed against `mini-buildd`.

#1039610 filed against `pnetcdf`.

#1039617 & #1039618 filed against `liblopsub`.

#1039619 filed against `wcc`.

#1039623 filed against `shotcut`.

#1039624 filed against `icu`.

#1039741 filed against `libapache-poi-java`.

#1039808 filed against `atf`.

#1039865 filed against `valgrind`.

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:

Additions to a (relatively) new Documented Jenkins Maintenance (djm) script to automatically shrink a cache & save a backup of old data [ ], automatically split out previous months data from logfiles into specially-named files [ ], prevent concurrent remote logfile fetches by using a lock file [ ] and to add/remove various debugging statements [ ].

Updates to the automated system health checks to, for example, to correctly detect new kernel warnings due to a wording change [ ] and to explicitly observe which old/unused kernels should be removed [ ]. This was related to an improvement so that various kernel issues on Ubuntu-based nodes are automatically fixed. [ ]

Holger and Vagrant Cascadian updated all thirty-five hosts running Debian on the `amd64`, `armhf`, and `i386` architectures to Debian bookworm, with the exception of the Jenkins host itself which will be upgraded after the release of Debian 12.1. In addition, Mattia Rizzolo updated the email configuration for the `@reproducible-builds.org` domain to correctly accept incoming mails from `jenkins.debian.net` [ ] as well as to set up DomainKeys Identified Mail (DKIM) signing [ ]. And working together with Holger, Mattia also updated the Jenkins configuration to start testing Debian trixie which resulted in stopped testing Debian buster. And, finally, Jan-Benedict Glaw contributed patches for improved NetBSD testing.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Mailing list: `rb-general@lists.reproducible-builds.org`

Twitter: @ReproBuilds

8 July 2023

Dirk Eddelbuettel: #40: Another r2u Example Making Colab Easier

Welcome to the 40th post in the $R^4 series. This one will just be a very brief illustration of r2u use in what might be an unexpected place: Google Colab. Colab has a strong bent towards Jupyter and Python but has been supporting R compute kernels for some time (by changing what they call the runtime ). And with a little exploration one can identify these are (currently, as of July 2023) running Ubuntu 20.04 aka focal . Which is of course one of two system supported by our lovely r2u project (with the other being the newer 22.04 aka jammy ). And I mostly tweeted / tooted about r2u since the its introduction in #37. And gave basically just a mention in passing in faster feedback post #38 as well as the faster feedback in ci post #39). So a brief recap may be in order. In essence, r2u makes all of CRAN available as full-fledged Ubuntu binaries with complete and full dependencies which are then installed directly and quickly via apt. Which, to top it of, are accessed directly from R via install.packages() so no special knowledge or sauce needed. We often summarize it as fast, easy, reliable: what is not to like . And, as we established in a few minutes of probing, it also works in the focal -based Colab session. The screen shot shows the basic step of fetching the setup script (for plain Ubuntu focal system) from r2u, making it executable and running it. Total time: 34 seconds. And after that we see the pure magic of install.packages("tidyverse") installing all of it in nine seconds. Additionally, we add the brms package in thirty-one seconds cia install.packages("brms"). Both load just fine and echo their current values.

The commands that are executed in that R session are just

download.file("https://github.com/eddelbuettel/r2u/raw/master/inst/scripts/add_cranapt_focal.sh",
              "add_cranapt_focal.sh")
Sys.chmod("add_cranapt_focal.sh", "0755")
system("./add_cranapt_focal.sh")
install.packages("tidyverse")
library(tidyverse)
install.packages("brms")
library(brms)

The timings are the Colab notebook are visible in the left margin. The lack of output makes debugging a little trickier so I still recommend to use r2u for first expploration via a Docker container as e.g. rocker/r2u:jammy. More information about r2u is at its site, and we answered some question in issues, and at stackoverflow. More questions are always welcome! If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

7 July 2023

Dirk Eddelbuettel: Rcpp 1.0.11 on CRAN: Updates and Maintenance

The Rcpp Core Team is delighted to announce that the newest release 1.0.11 of the Rcpp package arrived on CRAN and in Debian earlier today. Windows and macOS builds should appear at CRAN in the next few days, as will builds in different Linux distribution and of course at r2u. The release was finalized three days ago, but given the widespread use and extended reverse dependencies at CRAN it usually takes a few days to be processed. This release continues with the six-months January-July cycle started with release 1.0.5 in July 2020. As a reminder, we do of course make interim snapshot dev or rc releases available via the Rcpp drat repo and strongly encourage their use and testing I run my systems with these versions which tend to work just as well, and are also fully tested against all reverse-dependencies. Rcpp has long established itself as the most popular way of enhancing R with C or C++ code. Right now, 2720 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 251 in BioConductor. On CRAN, 13.7% of all packages depend (directly) on Rcpp, and 59.6% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 72.5 million times. The two published papers (also included in the package as preprint vignettes) have, respectively, 1678 (JSS, 2011) and 259 (TAS, 2018) citations, while the the book (Springer useR!, 2013) has another 588. This release is incremental as usual, generally preserving existing capabilities faithfully while smoothing our corners and / or extending slightly, sometimes in response to changing and tightened demands from CRAN or R standards. The full list below details all changes, their respective PRs and, if applicable, issue tickets. Big thanks from all of us to all contributors!

Changes in Rcpp version 1.0.11 (2023-07-03)

Changes in Rcpp API:

Rcpp:::CxxFlags() now quotes only non-standard include path on linux (Lukasz in #1243 closing #1242).

Two unit tests no longer accidentally bark on stdout (Dirk and I aki in #1245).

Compilation under C++20 using clang++ and its standard library is enabled (Dirk in #1248 closing #1244).

Use backticks in a generated .Call() statement in RcppExports.R (Dirk #1256 closing #1255).

Switch to system2() to capture standard error messages in error cases (I aki in #1259 and #1261 fixing #1257).

Changes in Rcpp Documentation:

The CITATION file format has been updated (Dirk in #1250 fixing #1249).

Changes in Rcpp Deployment:

A test for qnorm now uses the more accurate value from R 4.3.0 (Dirk in #1252 and #1260 fixing #1251).

Skip tests with path issues on Windows (I aki in #1258).

Container deployment in continuous integrations was improved. (I aki and Dirk in #1264, Dirk in #1269).

Several files receives minor edits to please R CMD check from r-devel (Dirk in #1267).

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bugs reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under rcpp tag at StackOverflow which also allows searching among the (currently) 2994 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 June 2023

John Goerzen: Using dar for Data Archiving

This is the third post in a series about data archiving to removable media (optical discs and hard drives). In the first, I explained the difference between backing up and archiving, established goals for the project, and said I d evaluate git-annex and dar. The second post evaluated git-annex, and now it s time to look at dar. The series will conclude with a post comparing git-annex with dar. What is dar? I could open with the same thing I did with git-annex, just changing the name of the program: [dar] is a fantastic and versatile program that does well, it s one of those things that can do so much that it s a bit hard to describe. It is, fundamentally, an archiver like tar or zip (makes one file representing a bunch of other files), but it goes far beyond that. dar s homepage lays out a comprehensive list of features, which I will try to summarize here.

Dar itself is both a library (with C++ and Python bindings) for interacting with data, and a CLI tool (dar itself).
Alongside this, there is an ecosystem of tools around dar, including GUIs for multiple platforms, backup scripts, and FUSE implementations.
Dar is like tar in that it can read and write files sequentially if desired. Dar archives can be streamed, just like tar archives. But dar takes it further; if you have dar_slave on the remote end, random access is possible over ssh (dramatically speeding up certain operations).
Dar is like zip in that a dar archive contains a central directory (called a catalog) which permits random access to the contents of an archive. In other words, you don t have to read an entire archive to extract just one file (assuming the archive is on disk or something that itself permits random access). Also, dar can compress each file individually, rather than the tar approach of compressing the archive as a whole. This increases archive performance (dar knows not to try to compress already-compressed data), boosts restore resilience (corruption of one part of an archive doesn t invalidate the entire rest of it), and boosts restore performance (permitting random access).
Dar can split an archive into multiple pieces called slices, and it can even split member files among the slices. The catalog contains information allowing you to know which slice(s) a given file is saved in.
The catalog can also be saved off in a file of its own (dar calls this an isolated catalog ). Isolated catalogs record just metadata about files archived.
dar_manager can assemble a database by reading archives or isolated catalogs, letting you know where files are stored and facilitating restores using the minimal number of discs.
Dar supports differential/incremental backups, which record changes since the last backup. These backups record not just additions, but also deletions. dar can optionally use rsync-style binary deltas to minimize the space needed to record changes. Dar does not suffer from GNU tar s data loss bug with incrementals.
Dar can slice and dice archives like Perl does strings. The usage notes page shows how you can merge archives, create decremental archives (where the full backup always reflects the current state of the system, and incrementals go backwards in time instead of forwards), etc. You can change the compression algorithm on an existing archive, re-slice it, etc.
Dar is extremely careful about preserving all metadata: hard links, sparse files, symlinks, timestamps (including subsecond resolution), EAs, POSIX ACLs, resource forks on Mac, detecting files being modified while being read, etc. It makes a nice way to copy directories, sort of similar to rsync -avxHAXS.

So to tie this together for this project, I will set up a 400MB slice size (to mimic what I did with git-annex), and see how dar saves the data and restores it. Isolated cataloges aren t strictly necessary for this, but by using them (and/or dar_manager), we can build up a database of files and locations and thus directly compare dar to git-annex location tracking. Walkthrough: Creating the first archive As with the git-annex walkthrough, I ll set some variables to make it easy to remember:

$SOURCEDIR is the directory being backed up
$DRIVE is the directory for backups to be stored in. Since dar can split by a specified size, I don t need to make separate filesystems to simulate the separate drive experience as I did with git-annex.
$CATDIR will hold isolated catalogs
$DARDB points to the dar_manager database

OK, we can run the backup immediately. No special setup is needed. dar supports both short-form (single-character) parameters and long-form ones. Since the parameters probably aren t familiar to everyone, I will use the long-form ones in these examples. Here s how we create our initial full backup. I ll explain the parameters below:



$ dar \

  --verbose \

  --create $DRIVE/bak1 \

  --on-fly-isolate $CATDIR/bak1 \

  --slice 400M \

  --min-digits 2 \

  --pause \

  --fs-root $SOURCEDIR

Let s look at each of these parameters:

verbose does what you expect
create selects the operation mode (like tar -c) and gives the archive basename
on-fly-isolate says to write an isolated catalog as well, right while making the archive. You can always create an isolated catalog later (which is fast, since it only needs to read the last bits of the last slice) but it s more convenient to do it now, so we do. We give the base name for the isolated catalog also.
slice 400M says to split the archive, and create slices 400MB each.
min-digits 2 pertains to naming files. Without it, dar would create files named bak1.dar.1, bak1.dar.2, bak1.dar.10, etc. dar works fine with this, but it can be annoying in ls. This is just convenience for humans.
pause tells dar to pause after writing each slice. This would let us swap drives, burn discs, etc. I do this for demonstration purposes only; it isn t strictly necessary in this situation. For a more powerful option, dar also supports execute, which can run commands after each slice.
fs-root gives the path to actually back up.

This same command could have been written with short options as:



$ dar -v -c $DRIVE/bak1 -@ $CATDIR/bak1 -s 400M -9 2 -p -R $SOURCEDIR

What does it look like while running? Here s an excerpt:



...

Adding file to archive: /acrypt/no-backup/jgoerzen/testdata/[redacted]

Finished writing to file 1, ready to continue ?  [return = YES   Esc = NO]

...

Writing down archive contents...

Closing the escape layer...

Writing down the first archive terminator...

Writing down archive trailer...

Writing down the second archive terminator...

Closing archive low layer...

Archive is closed.

 --------------------------------------------

 581 inode(s) saved

   including 0 hard link(s) treated

 0 inode(s) changed at the moment of the backup and could not be saved properly

 0 byte(s) have been wasted in the archive to resave changing files

 0 inode(s) with only metadata changed

 0 inode(s) not saved (no inode/file change)

 0 inode(s) failed to be saved (filesystem error)

 0 inode(s) ignored (excluded by filters)

 0 inode(s) recorded as deleted from reference backup

 --------------------------------------------

 Total number of inode(s) considered: 581

 --------------------------------------------

 EA saved for 0 inode(s)

 FSA saved for 581 inode(s)

 --------------------------------------------

Making room in memory (releasing memory used by archive of reference)...

Now performing on-fly isolation...

...

That was easy! Let s look at the contents of the backup directory:



$ ls -lh $DRIVE

total 3.7G

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:27 bak1.01.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:27 bak1.02.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:27 bak1.03.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:27 bak1.04.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:28 bak1.05.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:28 bak1.06.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:28 bak1.07.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:28 bak1.08.dar

-rw-r--r-- 1 jgoerzen jgoerzen 400M Jun 16 19:29 bak1.09.dar

-rw-r--r-- 1 jgoerzen jgoerzen 156M Jun 16 19:33 bak1.10.dar

And the isolated catalog:



$ ls -lh $CATDIR

total 37K

-rw-r--r-- 1 jgoerzen jgoerzen 35K Jun 16 19:33 bak1.1.dar

The isolated catalog is stored compressed automatically. Well this was easy. With one command, we archived the entire data set, split into 400MB chunks, and wrote out the catalog data. Walkthrough: Inspecting the saved archive Can dar tell us which slice contains a given file? Sure:



$ dar --list $DRIVE/bak1 --list-format=slicing   less

Slice(s) [Data ][D][ EA  ][FSA][Compr][S] Permission  Filemane

--------+--------------------------------+----------+-----------------------------

...

1        [Saved][ ]       [-L-][   0%][X] -rwxr--r-- [redacted]

1-2      [Saved][ ]       [-L-][   0%][X] -rwxr--r-- [redacted]

2        [Saved][ ]       [-L-][   0%][X] -rwxr--r-- [redacted]

...

This illustrates the transition from slice 1 to slice 2. The first file was stored entirely in slice 1; the second stored partially in slice 1 and partially in slice 2, and third solely in slice 2. We can get other kinds of information as well.



$ dar --list $DRIVE/bak1   less

[Data ][D][ EA  ][FSA][Compr][S]  Permission   User    Group   Size               Date                      filename

--------------------------------+------------+-------+-------+---------+-------------------------------+------------

[Saved][ ]       [-L-][   0%][X]  -rwxr--r--   jgoerzen jgoerzen        24 Mio  Mon Mar  5 07:58:09 2018        [redacted]

[Saved][ ]       [-L-][   0%][X]  -rwxr--r--   jgoerzen jgoerzen        16 Mio  Mon Mar  5 07:58:09 2018        [redacted]

[Saved][ ]       [-L-][   0%][X]  -rwxr--r--   jgoerzen jgoerzen        22 Mio  Mon Mar  5 07:58:09 2018        [redacted]

These are the same files I was looking at before. Here we see they are 24MB, 16MB, and 22MB in size, and some additional metadata. Even more is available in the XML list format. Walkthrough: updates As with git-annex, I ve made some changes in the source directory: moved a file, added another, and deleted one. Let s create an incremental backup now:



$ dar \

  --verbose \

  --create $DRIVE/bak2 \

  --on-fly-isolate $CATDIR/bak2 \

  --ref $CATDIR/bak1 \

  --slice 400M \

  --min-digits 2 \

  --pause \

  --fs-root $SOURCEDIR

This command is very similar to the earlier one. Instead of writing an archive and catalog named bak1, we write one named bak2. What s new here is --ref $CATDIR/bak1. That says, make an incremental based on an archive of reference. All that is needed from that archive of reference is the detached catalog. --ref $DRIVE/bak1 would have worked equally well here. Here s what I did to the $SOURCEDIR:

Renamed a file to file01-unchanged
Deleted a file
Copied /bin/cp to a file named cp

Let s see if dar s command output matches this:



...

Adding file to archive: /acrypt/no-backup/jgoerzen/testdata/file01-unchanged

Saving Filesystem Specific Attributes for /acrypt/no-backup/jgoerzen/testdata/file01-unchanged

Adding file to archive: /acrypt/no-backup/jgoerzen/testdata/cp

Saving Filesystem Specific Attributes for /acrypt/no-backup/jgoerzen/testdata/cp

Adding folder to archive: [redacted]

Saving Filesystem Specific Attributes for [redacted]

Adding reference to files that have been destroyed since reference backup...

...

 --------------------------------------------

 3 inode(s) saved

   including 0 hard link(s) treated

 0 inode(s) changed at the moment of the backup and could not be saved properly

 0 byte(s) have been wasted in the archive to resave changing files

 0 inode(s) with only metadata changed

 578 inode(s) not saved (no inode/file change)

 0 inode(s) failed to be saved (filesystem error)

 0 inode(s) ignored (excluded by filters)

 2 inode(s) recorded as deleted from reference backup

 --------------------------------------------

 Total number of inode(s) considered: 583

 --------------------------------------------

 EA saved for 0 inode(s)

 FSA saved for 3 inode(s)

 --------------------------------------------

...

Yes, it does. The rename is recorded as a deletion and an addition, since dar doesn t directly track renames. So the rename plus the deletion account for the two deletions. The rename plus the addition of cp count as 2 of the 3 inodes saved; the third is the modified directory from which files were deleted and moved out. Let s see the files that were created:



$ ls -lh $DRIVE/bak2*

-rw-r--r-- 1 jgoerzen jgoerzen 18M Jun 16 19:52 /acrypt/no-backup/jgoerzen/dar-testing/drive/bak2.01.dar

$ ls -lh $CATDIR/bak2*

-rw-r--r-- 1 jgoerzen jgoerzen 22K Jun 16 19:52 /acrypt/no-backup/jgoerzen/dar-testing/cat/bak2.1.dar

What does list look like now?



Slice(s) [Data ][D][ EA  ][FSA][Compr][S] Permission  Filemane

--------+--------------------------------+----------+-----------------------------

         [     ][ ]       [---][-----][X] -rwxr--r-- [redacted]

1        [Saved][ ]       [-L-][   0%][X] -rwxr--r-- file01-unchanged

...

         [--- REMOVED ENTRY ----][redacted]

         [--- REMOVED ENTRY ----][redacted]

Here I show an example of:

A file that was not changed from the initial backup. Its presence was simply noted, but because we re doing an incremental, the data wasn t saved.
A file that is saved in this incremental, on slice 1.
The two deleted files

Walkthrough: dar_manager As we ve seen above, the two archives (or their detached catalog) give us a complete picture of what files were present at the time of the creation of each archive, and what files were stored in a given archive. We can certainly continue working in that way. We can also use dar_manager to build a comprehensive database of these archives, to be able to find what media is necessary to restore each given file. Or, with dar_manager s when parameter, we can restore files as of a particular date. Let s try it out. First, we create our database:



$ dar_manager --create $DARDB

$ dar_manager --base $DARDB --add $DRIVE/bak1

Auto detecting min-digits to be 2

$ dar_manager --base $DARDB --add $DRIVE/bak2

Auto detecting min-digits to be 2

Here we created the database, and added our two catalogs to it. (Again, we could have as easily used $CATDIR/bak1; either the archive or its isolated catalog will work here.) It s important to add the catalogs in order. Let s do some quick experimentation with dar_manager:



$ dar_manager -v --base $DARDB --list

Decompressing and loading database to memory...


dar path         :

dar options      :

database version : 6

compression used : gzip

compression level: 9
archive #        path           basename

------------+--------------+---------------

        1       /acrypt/no-backup/jgoerzen/dar-testing/drive    bak1

        2       /acrypt/no-backup/jgoerzen/dar-testing/drive    bak2

$ dar_manager --base $DARDB --stat

  archive #      most recent/total data    most recent/total EA

--------------+-------------------------+-----------------------

             1 580/581                        0/0

             2 3/3                            0/0

The list option shows the correlation between dar_manager archive number (1, 2) with filenames (bak1, bak2). It is coincidence here that 1/bak1 and 2/bak2 correlate; that s not necessarily the case. Most dar_manager commands operate on archive number, while dar commands operate on archive path/basename. Now let s see just what files are saved in archive #2, the incremental:



$ dar_manager --base $DARDB --used 2

[ Saved ][       ]  [redacted]

[ Saved ][       ]  file01-unchanged

[ Saved ][       ]  cp

Now we can also where a file is stored. Here s one that was saved in the full backup and unmodified in the incremental:



$ dar_manager --base $DARDB --file [redacted]

        1       Fri Jun 16 19:15:12 2023  saved                                 absent

        2       Fri Jun 16 19:15:12 2023  present                               absent

(The absent at the end refers to extended attributes that the file didn t have) Similarly, for files that were added or removed, they ll be listed only at the appropriate place. Walkthrough: Restoration I m not going to repeat the author s full restoration with dar page, but here are some quick examples. A simple way of doing everything is using incrementals for the whole series. To do that, you d have bak1 be full, bak2 based on bak1, bak3 based on bak2, bak4 based on bak3, etc. To restore from such a series, you have two options:

Use dar to simply extract each archive in order. It will handle deletions, renames, etc. along the way.
Use dar_manager with the backup database to do manage the process. It may be somewhat more efficient, as it won t bother to restore files that will later be modified or deleted.

If you get fancy for instance, bak2 is based on bak1, bak3 on bak2, bak4 on bak1 then you would want to use dar_manager to ensure a consistent restore is completed. Either way, the process is nearly identical. Also, I figure, to make things easy, you can save a copy of the entire set of isolated catalogs before you finalize each disc/drive. They re so small, and this would let someone with just the most recent disc build a dar_manager database without having to go through all the other discs. Anyhow, let s do a restore using just dar. I ll make a $RESTOREDIR and do it that way.



$ dar \

  --verbose \

  --extract $DRIVE/bak1 \

  --fs-root $RESTOREDIR \

  --no-warn \

  --execute "echo Ready for slice %n.  Press Enter; read foo"

This execute lets us see how dar works; this is an illustration of the power it has (above pause); it s a snippet interpreted by /bin/sh with %n being one of the dar placeholders. If memory serves, it s not strictly necessary, as dar will prompt you for slices it needs if they re not mounted. Anyhow, you ll see it first reading the last slice, which contains the catalog, then reading from the beginning. Here we go:



Auto detecting min-digits to be 2

Opening archive bak1 ...

Opening the archive using the multi-slice abstraction layer...

Ready for slice 10. Press Enter

...

Loading catalogue into memory...

Locating archive contents...

Reading archive contents...

File ownership will not be restored du to the lack of privilege, you can disable this message by asking not to restore file ownership [return = YES   Esc = NO]

Continuing...

Restoring file's data: [redacted]

Restoring file's FSA: [redacted]

Ready for slice 1. Press Enter

...

Ready for slice 2. Press Enter

...

 --------------------------------------------

 581 inode(s) restored

    including 0 hard link(s)

 0 inode(s) not restored (not saved in archive)

 0 inode(s) not restored (overwriting policy decision)

 0 inode(s) ignored (excluded by filters)

 0 inode(s) failed to restore (filesystem error)

 0 inode(s) deleted

 --------------------------------------------

 Total number of inode(s) considered: 581

 --------------------------------------------

 EA restored for 0 inode(s)

 FSA restored for 0 inode(s)

 --------------------------------------------

The warning is because I m not doing the extraction as root, which limits dar s ability to fully restore ownership data. OK, now the incremental:



$ dar \

  --verbose \

  --extract $DRIVE/bak2 \

  --fs-root $RESTOREDIR \

  --no-warn \

  --execute "echo Ready for slice %n.  Press Enter; read foo"

...

Ready for slice 1. Press Enter

...

Restoring file's data: /acrypt/no-backup/jgoerzen/dar-testing/restore/file01-unchanged

Restoring file's FSA: /acrypt/no-backup/jgoerzen/dar-testing/restore/file01-unchanged

Restoring file's data: /acrypt/no-backup/jgoerzen/dar-testing/restore/cp

Restoring file's FSA: /acrypt/no-backup/jgoerzen/dar-testing/restore/cp

Restoring file's data: /acrypt/no-backup/jgoerzen/dar-testing/restore/[redacted directory]

Removing file (reason is file recorded as removed in archive): [redacted file]

Removing file (reason is file recorded as removed in archive): [redacted file]

This all looks right! Now how about we compare the restore to the original source directory?



$ diff -durN $SOURCEDIR $RESTOREDIR

No changes perfect. We could instead do this restore via a single dar_manager command, though annoyingly, we d have to pass all top-level files/directories to dar_manager restore. But still, it s one command, and basically automates and optimizes the dar restores shown above. Conclusions Dar makes it extremely easy to just Do The Right Thing when making archives. One command makes a backup. It saves things in simple files. You can make an isolated catalog if you want, and it too is saved in a simple file. You can query what is in the files and where. You can restore from all or part of the files. You can simply play the backups forward, in order, to achieve a full and consistent restore. Or you can load data about them into dar_manager for an optimized restore. A bit of scripting will be necessary to make incrementals; finding the most recent backup or catalog. If backup files are named with care for instance, by date then this should be a pretty easy task. I haven t touched on resiliency yet. dar comes with tools for recovering archives that have had portions corrupted or lost. It can also rebuild the catalog if it is corrupted or lost. It adds tape marks (or escape sequences ) to the archive along with the data stream. So every entry in the catalog is actually stored in the archive twice: once alongside the file data, and once at the end in the collected catalog. This allows dar to scan a corrupted file for the tape marks and reconstruct whatever is still intact, even if the catalog is lost. dar also integrates with tools like sha256sum and par2 to simplify archive integrity testing and restoration. This balances against the need to use a tool (dar, optionally with a GUI frontend) to restore files. I ll discuss that more in the next post.

22 January 2023

Dirk Eddelbuettel: Rcpp 1.0.10 on CRAN: Regular Update

The Rcpp team is thrilled to announce the newest release 1.0.10 of the Rcpp package which is hitting CRAN now and will go to Debian shortly. Windows and macOS builds should appear at CRAN in the next few days, as will builds in different Linux distribution and of course at r2u. The release was prepared a few days ago, but given the widespread use at CRAN it took a few days to be processed. As always, our sincere thanks to the CRAN maintainers Uwe Ligges and Kurt Hornik. This release continues with the six-months cycle started with release 1.0.5 in July 2020. As a reminder, we do of course make interim snapshot dev or rc releases available via the Rcpp drat repo and strongly encourage their use and testing I run my systems with these versions which tend to work just as well, and are also fully tested against all reverse-dependencies. Rcpp has become the most popular way of enhancing R with C or C++ code. Right now, around 2623 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 252 in BioConductor. On CRAN, 13.7% of all packages depend (directly) on CRAN, and 58.7% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 67.1 million times. This release is incremental as usual, preserving existing capabilities faithfully while smoothing our corners and / or extending slightly. Of particular note is the now fully-enabled use of the unwind protection making some operations a little faster by default; special thanks to I aki for spearheading this. Kevin and I also polished a few other bugs off as detailed below. The full list of details follows.

Changes in Rcpp release version 1.0.10 (2023-01-12)

Changes in Rcpp API:

Unwind protection is enabled by default (I aki in #1225). It can be disabled by defining RCPP_NO_UNWIND_PROTECT before including Rcpp.h. RCPP_USE_UNWIND_PROTECT is not checked anymore and has no effect. The associated plugin unwindProtect is therefore deprecated and will be removed in a future release.

The 'finalize' method for Rcpp Modules is now eagerly materialized, fixing an issue where errors can occur when Module finalizers are run (Kevin in #1231 closing #1230).

Zero-row data.frame objects can receive push_back or push_front (Dirk in #1233 fixing #1232).

One remaining sprintf has been replaced by snprintf (Dirk and Kevin in #1236 and #1237).

Several conversion warnings found by clang++ have been addressed (Dirk in #1240 and #1241).

Changes in Rcpp Attributes:

The C++20, C++2b (experimental) and C++23 standards now have plugin support like the other C++ standards (Dirk in #1228).

The source path for attributes received one more protection from spaces (Dirk in #1235 addressing #1234).

Changes in Rcpp Deployment:

Several GitHub Actions have been updated.

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bugs reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under rcpp tag at StackOverflow which also allows searching among the (currently) 2932 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

27 December 2022

Chris Lamb: Favourite books of 2022: Fiction

This post marks the beginning my yearly roundups of the favourite books and movies that I read and watched in 2022 that I plan to publish over the next few days. Just as I did for 2020 and 2021, I won't reveal precisely how many books I read in the last year. I didn't get through as many books as I did in 2021, though, but that's partly due to reading a significant number of long nineteenth-century novels in particular, a fair number of those books that American writer Henry James once referred to as "large, loose, baggy monsters." However, in today's post I'll be looking at my favourite books that are typically filed under fiction, with 'classic' fiction following tomorrow. Works that just missed the cut here include John O'Brien's Leaving Las Vegas, Colson Whitehead's Sag Harbor and possibly The Name of the Rose by Umberto Eco, or Elif Batuman's The Idiot. I also feel obliged to mention (or is that show off?) that I also read the 1,079-page Infinite Jest by David Foster Wallace, but I can't say it was a favourite, let alone recommend others unless they are in the market for a good-quality under-monitor stand.

Mona (2021) Pola Oloixarac Mona is the story of a young woman who has just been nominated for the 'most important literary award in Europe'. Mona sees the nomination as a chance to escape her substance abuse on a Californian campus and so speedily decamps to the small village in the depths of Sweden where the nominees must convene for a week before the overall winner is announced. Mona didn't disappear merely to avoid pharmacological misadventures, though, but also to avoid the growing realisation that she is being treated as something of an anthropological curiosity at her university: a female writer of colour treasured for her flourish of exotic diversity that reflects well upon her department. But Mona is now stuck in the company of her literary competitors who all have now gathered from around the world in order to do what writers do: harbour private resentments, exchange empty flattery, embody the selfsame racialised stereotypes that Mona left the United States to avoid, stab rivals in the back, drink too much, and, of course, go to bed together. But as I read Mona, I slowly started to realise that something else is going on. Why does Mona keep finding traces of violence on her body, the origins of which she cannot or refuses to remember? There is something eerily defensive about her behaviour and sardonic demeanour in general as well. A genre-bending and mind-expanding novel unfolded itself, and, without getting into spoiler territory, Mona concludes with such a surprising ending that, according to Adam Thirlwell:

Perhaps we need to rethink what is meant by a gimmick. If a gimmick is anything that we want to reject as extra or excessive or ill-fitting, then it may be important to ask what inhibitions or arbitrary conventions have made it seem like excess, and to revel in the exorbitant fictional constructions it produces. [...]

Mona is a savage satire of the literary world, but it's also a very disturbing exploration of trauma and violence. The success of the book comes in equal measure from the author's commitment to both ideas, but also from the way the psychological damage component creeps up on you. And, as implied above, the last ten pages are quite literally out of this world.

My Brilliant Friend (2011)
The Story of a New Name (2012)
Those Who Leave and Those Who Stay (2013)
The Story of the Lost Child (2014) Elena Ferrante Elena Ferrante's Neopolitan Quartet follows two girls, both brilliant in their own way. Our protagonist-narrator is Elena, a studious girl from the lower rungs of the middle class of Naples who is inspired to be more by her childhood friend, Lila. Lila is, in turn, far more restricted by her poverty and class, but can transcend it at times through her fiery nature, which also brands her as somewhat unique within their inward-looking community. The four books follow the two girls from the perspective of Elena as they grow up together in post-war Italy, where they drift in-and-out of each other's lives due to the vicissitudes of change and the consequences of choice. All the time this is unfolding, however, the narrative is very always slightly charged by the background knowledge revealed on the very first page that Lila will, many years later, disappear from Elena's life. Whilst the quartet has the formal properties of a bildungsroman, its subject and conception are almost entirely different. In particular, the books are driven far more by character and incident than spectacular adventures in picturesque Italy. In fact, quite the opposite takes place: these are four books where ordinary-seeming occurrences take on an unexpected radiance against a background of poverty, ignorance, violence and other threats, often bringing to mind the films of the Italian neorealism movement. Brilliantly rendered from beginning to end, Ferrante has a seemingly studious eye for interpreting interactions and the psychology of adolescence and friendship. Some utterances indeed, perhaps even some glances are dissected at length over multiple pages, something that Vittorio De Sica's classic Bicycle Thieves (1948) could never do. Potential readers should not take any notice of the saccharine cover illustrations on most editions of the books. The quartet could even win an award for the most misleading artwork, potentially rivalling even Vladimir Nabokov's Lolita. I wouldn't be at all surprised if it is revealed that the drippy illustrations and syrupy blurbs ("a rich, intense and generous-hearted story ") turn out to be part of a larger metatextual game that Ferrante is playing with her readers. This idiosyncratic view of mine is partially supported by the fact that each of the four books has been given a misleading title, the true ambiguity of which often only becomes clear as each of the four books comes into sharper focus. Readers of the quartet often fall into debating which is the best of the four. I've heard from more than one reader that one has 'too much Italian politics' and another doesn't have enough 'classic' Lina moments. The first book then possesses the twin advantages of both establishing the environs and finishing with a breathtaking ending that is both satisfying and a cliffhanger as well but does this make it 'the best'? I prefer to liken the quartet more like the different seasons of The Wire (2002-2008) where, personal favourites and preferences aside, although each season is undoubtedly unique, it would take a certain kind of narrow-minded view of art to make the claim that, say, series one of The Wire is 'the best' or that the season that focuses on the Baltimore docks 'is boring'. Not to sound like a neo-Wagnerian, but each of them adds to final result in its own. That is to say, both The Wire and the Neopolitan Quartet achieve the rare feat of making the magisterial simultaneously intimate.

Out There: Stories (2022) Kate Folk Out There is a riveting collection of disturbing short stories by first-time author Kate Fork. The title story first appeared in the New Yorker in early 2020 imagines a near-future setting where a group of uncannily handsome artificial men called 'blots' have arrived on the San Francisco dating scene with the secret mission of sleeping with women, before stealing their personal data from their laptops and phones and then (quite literally) evaporating into thin air. Folk's satirical style is not at all didactic, so it rarely feels like she is making her points in a pedantic manner. But it's clear that the narrator of Out There is recounting her frustration with online dating. in a way that will resonate with anyone who s spent time with dating apps or indeed the contemporary hyper-centralised platform-based internet in general. Part social satire, part ghost story and part comic tales, the blurring of the lines between these factors is only one of the things that makes these stories so compelling. But whilst Folk constructs crazy scenarios and intentionally strange worlds, she also manages to also populate them with characters that feel real and genuinely sympathetic. Indeed, I challenge you not to feel some empathy for the 'blot' in the companion story Big Sur which concludes the collection, and it complicates any primary-coloured view of the dating world of consisting entirely of predatory men. And all of this is leavened with a few stories that are just plain surreal. I don't know what the deal is with Dating a Somnambulist (available online on Hobart Pulp), but I know that I like it.

Solaris (1961) Stanislaw Lem When Kelvin arrives at the planet Solaris to study the strange ocean that covers its surface, instead of finding an entirely physical scientific phenomenon, he soon discovers a previously unconscious memory embodied in the physical manifestation of a long-dead lover. The other scientists on the space station slowly reveal that they are also plagued with their own repressed corporeal memories. Many theories are put forward as to why all this is occuring, including the idea that Solaris is a massive brain that creates these incarnate memories. Yet if that is the case, the planet's purpose in doing so is entirely unknown, forcing the scientists to shift focus and wonder whether they can truly understand the universe without first understanding what lies within their own minds and in their desires. This would be an interesting outline for any good science fiction book, but one of the great strengths of Solaris is not only that it withholds from the reader why the planet is doing anything it does, but the book is so forcefully didactic in its dislike of the hubris, destructiveness and colonial thinking that can accompany scientific exploration. In one of its most vitriolic passages, Lem's own anger might be reaching out to the reader:

We are humanitarian and chivalrous; we don t want to enslave other races, we simply want to bequeath them our values and take over their heritage in exchange. We think of ourselves as the Knights of the Holy Contact. This is another lie. We are only seeking Man. We have no need of other worlds. We need mirrors. We don t know what to do with other worlds. A single world, our own, suffices us; but we can t accept it for what it is. We are searching for an ideal image of our own world: we go in quest of a planet, of a civilisation superior to our own, but developed on the basis of a prototype of our primaeval past. At the same time, there is something inside us that we don t like to face up to, from which we try to protect ourselves, but which nevertheless remains since we don t leave Earth in a state of primal innocence. We arrive here as we are in reality, and when the page is turned, and that reality is revealed to us that part of our reality that we would prefer to pass over in silence then we don t like it anymore.

An overwhelming preoccupation with this idea infuses Solaris, and it turns out to be a common theme in a lot of Lem's work of this period, such as in his 1959 'anti-police procedural' The Investigation. Perhaps it not a dislike of exploration in general or the modern scientific method in particular, but rather a savage critique of the arrogance and self-assuredness that accompanies most forms of scientific positivism, or at least pursuits that cloak themselves under the guise of being a laudatory 'scientific' pursuit:

Man has gone out to explore other worlds and other civilizations without having explored his own labyrinth of dark passages and secret chambers and without finding what lies behind doorways that he himself has sealed.

I doubt I need to cite specific instances of contemporary scientific pursuits that might meet Lem's punishing eye today, and the fact that his critique works both in 2022 and 1961 perhaps tells us more about the human condition than we'd care to know. Another striking thing about Solaris isn't just the specific Star Trek and Stargate SG-1 episodes that I retrospectively realised were purloined from the book, but that almost the entire register of Star Trek: The Next Generation in particular seems to be rehearsed here. That is to say, TNG presents itself as hard and fact-based 'sci-fi' on the surface, but, at its core, there are often human, existential and sometimes quite enormously emotionally devastating human themes being discussed such as memory, loss and grief. To take one example from many, the painful memories that the planet Solaris physically materialises in effect asks us to seriously consider what it actually is taking place when we 'love' another person: is it merely another 'mirror' of ourselves? (And, if that is the case, is that... bad?) It would be ahistorical to claim that all popular science fiction today can be found rehearsed in Solaris, but perhaps it isn't too much of a stretch:

[Solaris] renders unnecessary any more alien stories. Nothing further can be said on this topic ...] Possibly, it can be said that when one feels the urge for such a thing, one should simply reread Solaris and learn its lessons again. Kim Stanley Robinson [...]

I could go on praising this book for quite some time; perhaps by discussing the extreme framing devices used within the book at one point, the book diverges into a lengthy bibliography of fictional books-within-the-book, each encapsulating a different theory about what the mechanics and/or function of Solaris is, thereby demonstrating that 'Solaris studies' as it is called within the world of the book has been going on for years with no tangible results, which actually leads to extreme embarrassment and then a deliberate and willful blindness to the 'Solaris problem' on the part of the book's scientific community. But I'll leave it all here before this review gets too long... Highly recommended, and a likely reread in 2023.

Brokeback Mountain (1997) Annie Proulx Brokeback Mountain began as a short story by American author Annie Proulx which appeared in the New Yorker in 1997, although it is now more famous for the 2005 film adaptation directed by Taiwanese filmmaker Ang Lee. Both versions follow two young men who are hired for the summer to look after sheep at a range under the 'Brokeback' mountain in Wyoming. Unexpectedly, however, they form an intense emotional and sexual attachment, yet life intervenes and demands they part ways at the end of the summer. Over the next twenty years, though, as their individual lives play out with marriages, children and jobs, they continue reuniting for brief albeit secret liaisons on camping trips in remote settings. There's no feigned shyness or self-importance in Brokeback Mountain, just a close, compassionate and brutally honest observation of a doomed relationship and a bone-deep feeling for the hardscrabble life in the post-War West. To my mind, very few books have captured so acutely the desolation of a frustrated and repressed passion, as well as the particular flavour of undirected anger that can accompany this kind of yearning. That the original novella does all this in such a beautiful way (and without the crutch of the Wyoming landscape to look at ) is a tribute to Proulx's skills as a writer. Indeed, even without the devasting emotional undertones, Proulx's descriptions of the mountains and scree of the West is likely worth the read alone.

Luster (2020) Raven Leilani Edie is a young Black woman living in New York whose life seems to be spiralling out of control. She isn't good at making friends, her career is going nowhere, and she has no close family to speak of as well. She is, thus, your typical NYC millennial today, albeit seen through a lens of Blackness that complicates any reductive view of her privilege or minority status. A representative paragraph might communicate the simmering tone:

Before I start work, I browse through some photos of friends who are doing better than me, then an article on a black teenager who was killed on 115th for holding a weapon later identified as a showerhead, then an article on a black woman who was killed on the Grand Concourse for holding a weapon later identified as a cell phone, then I drown myself in the comments section and do some online shopping, by which I mean I put four dresses in my cart as a strictly theoretical exercise and then let the page expire.

She starts a sort-of affair with an older white man who has an affluent lifestyle in nearby New Jersey. Eric or so he claims has agreed upon an 'open relationship' with his wife, but Edie is far too inappropriate and disinhibited to respect any boundaries that Eric sets for her, and so Edie soon becomes deeply entangled in Eric's family life. It soon turns out that Eric and his wife have a twelve-year-old adopted daughter, Akila, who is also wait for it Black. Akila has been with Eric's family for two years now and they aren t exactly coping well together. They don t even know how to help her to manage her own hair, let alone deal with structural racism. Yet despite how dark the book's general demeanour is, there are faint glimmers of redemption here and there. Realistic almost to the end, Edie might finally realise what s important in her life, but it would be a stretch to say that she achieves them by the final page. Although the book is full of acerbic remarks on almost any topic (Dogs: "We made them needy and physically unfit. They used to be wolves, now they are pugs with asthma."), it is the comments on contemporary race relations that are most critically insightful. Indeed, unsentimental, incisive and funny, Luster had much of what I like in Colson Whitehead's books at times, but I can't remember a book so frantically fast-paced as this since the Booker-prize winning The Sellout by Paul Beatty or Sam Tallent's Running the Light.

9 October 2022

Jonathan Dowland: Focus writing with (despite) LaTeX

LaTeX the age-old typesetting system makes me angry. Not because it's bad. To clarify, not because there's something better. But because there should be. When writing a document using LaTeX, if you are prone to procrastination it can be very difficult to focus on the task at hand, because there are so many yaks to shave. Here's a few points of advice.

format the document source for legible reading. Yes, it's the input to the typesetter, and yes, the output of the typesetter needs to be legible. But it's worth making the input easy to read, too. Because
avoid rebuilding your rendered document too often. It's slow, it takes you out of the activity of writing, and it throws up lots of opportunities to get distracted by some rendering nit that you didn't realise would happen.
Unless you are very good at manoeuvring around long documents, liberally split them up. I think it's fine to have sections in their own source files.
Machine-assisted moving around documents is good. If you use (neo)vim, you can tweak exuberant-ctags to generate more useful tags for LaTeX documents than what you get OOTB, including jumping to \label s and the BibTeX source of \cite s. See this stackoverflow post.
If you use syntax highlighting in your editor, take a long, hard look at what it's drawing attention to. It's not your text, that's for sure. Is it worth having it on? Consider turning it off. Or (yak shaving beware!) tweak it to de-emphasise things, instead of emphasising them. One small example for (neo)vim, to change tokens recognised as being "todo" to match the styling used for comments (which is normally de-emphasised):
```
hi def link texTodo Comment
```

In a nutshell, I think it's wise to move much document reviewing work back into the editor rather than the rendered document, at least in the early stages of a section. And to do that, you need the document to be as legible as possible in the editor. The important stuff is the text you write, not the TeX macros you've sprinkled around to format it. A few tips I benefit from in terms of source formatting:

I stick a line of 78 '%' characters between each section and sub-section. This helps to visually break them up and finding them in a scroll-past is quicker.
I indent as much of the content as I can in each chapter/section/subsection (however deep I go in sections) to tie them to the section they belong to and see at a glance how deep I am in subsections, just like with source code. The exception is environments that I can't due to other tool limitations: I have code excerpts demarked by \begin code /\end code which are executed by Haskell's GHCi interpreter, and the indentation can interfere with Haskell's indentation rules.
For large documents (like a thesis), I have little helper "standalone" .tex files whose purpose is to let me build just one chapter or section at a time.
I'm fairly sure I'll settle on a serif font for my final document. But I have found that a sans-serif font is easier on my eyes on-screen. YMMV.

Of course, you need to review the rendered document too! I like to bounce that to a tablet with a pen/stylus/pencil and review it in a different environment to where I write. I then end up with a long list of scrawled notes, and a third distinct activity, back at the writing desk, is to systematically go through them and apply some GTD-style thinking to them: can I fix it in a few seconds? Do it straight away. Delegate it? Unlikely Defer it? transfer the review note into another system of record (such as LaTeX \\todo ). And finally

Don't stop to write a blog post about it.

19 September 2022

Antoine Beaupr : Looking at Wayland terminal emulators

Back in 2018, I made a two part series about terminal emulators that was actually pretty painful to write. So I'm not going to retry this here, not at all. Especially since I'm not submitting this to the excellent LWN editors so I can get away with not being very good at writing. Phew. Still, it seems my future self will thank me for collecting my thoughts on the terminal emulators I have found out about since I wrote that article. Back then, Wayland was not quite at the level where it is now, being the default in Fedora (2016), Debian (2019), RedHat (2019), and Ubuntu (2021). Also, a bunch of folks thought they would solve everything by using OpenGL for rendering. Let's see how things stack up.

Recap In the previous article, I touched on those projects:

Terminal Changes since review

Alacritty releases! scrollback, better latency, URL launcher, clipboard support, still not in Debian, but close

GNOME Terminal not much? couldn't find a changelog

Konsole outdated changelog, color, image previews, clickable files, multi-input, SSH plugin, sixel images

mlterm long changelog but: supports console mode (like GNU screen?!), Wayland support through libvte, sixel graphics, zmodem, mosh

pterm changes: Wayland support

st unparseable changelog, suggests scroll(1) or scrollback.patch for scrollback now

Terminator moved to GitHub, Python 3 support, not being dead

urxvt no significant changes, a single release, still in CVS!

Xfce Terminal hard to parse changelog, presumably some improvements to paste safety?

xterm notoriously hard to parse changelog, improvements to paste safety (`disallowedPasteControls`), fonts, clipboard improvements?

After writing those articles, bizarrely, I was still using rxvt even though it did not come up as shiny as I would have liked. The colors problems were especially irritating. I briefly played around with Konsole and xterm, and eventually switched to XTerm as my default `x-terminal-emulator` "alternative" in my Debian system, while writing this. I quickly noticed why I had stopped using it: clickable links are a huge limitation. I ended up adding keybindings to open URLs in a command. There's another keybinding to dump the history into a command. Neither are as satisfactory as just clicking a damn link.

Terminal	Changes since review
Alacritty	releases! scrollback, better latency, URL launcher, clipboard support, still not in Debian, but close
GNOME Terminal	not much? couldn't find a changelog
Konsole	outdated changelog, color, image previews, clickable files, multi-input, SSH plugin, sixel images
mlterm	long changelog but: supports console mode (like GNU screen?!), Wayland support through libvte, sixel graphics, zmodem, mosh
pterm	changes: Wayland support
st	unparseable changelog, suggests scroll(1) or scrollback.patch for scrollback now
Terminator	moved to GitHub, Python 3 support, not being dead
urxvt	no significant changes, a single release, still in CVS!
Xfce Terminal	hard to parse changelog, presumably some improvements to paste safety?
xterm	notoriously hard to parse changelog, improvements to paste safety (`disallowedPasteControls`), fonts, clipboard improvements?

Requirements Figuring out my requirements is actually a pretty hard thing to do. In my last reviews, I just tried a bunch of stuff and collected everything, but a lot of things (like tab support) I don't actually care about. So here's a set of things I actually do care about:

latency

resource usage

proper clipboard support, that is:

mouse selection and middle button uses PRIMARY

`control-shift-c` and `control-shift-v` for CLIPBOARD

true color support

no known security issues

active project

paste protection

clickable URLs

scrollback

font resize

non-destructive text-wrapping (ie. resizing a window doesn't drop scrollback history)

proper unicode support (at least latin-1, ideally "everything")

good emoji support (at least showing them, ideally "nicely"), which involves font fallback

Latency is particularly something I wonder about in Wayland. Kitty seem to have been pretty dilligent at doing latency tests, claiming 35ms with a hardware-based latency tester and 7ms with typometer, but it's unclear how those would come up in Wayland because, as far as I know, typometer does not support Wayland.

Candidates Those are the projects I am considering.

darktile - GPU rendering, Unicode support, themable, ligatures (optional), Sixel, window transparency, clickable URLs, true color support, not in Debian

foot - Wayland only, daemon-mode, sixel images, scrollback search, true color, font resize, URLs not clickable, but keyboard-driven selection, proper clipboard support, in Debian

havoc - minimal, scrollback, configurable keybindings, not in Debian

sakura - libvte, Wayland support, tabs, no menu bar, original libvte gangster, dynamic font size, probably supports Wayland, in Debian

termonad - Haskell? in Debian

wez - Rust, Wayland, multiplexer, ligatures, scrollback search, clipboard support, bracketed paste, panes, tabs, serial port support, Sixel, Kitty, iTerm graphics, built-in SSH client (!?), not in Debian

XTerm - status quo, no Wayland port obviously

zutty: OpenGL rendering, true color, clipboard support, small codebase, no Wayland support, crashes on bremner's, in Debian

Candidates not considered

Alacritty I would really, really like to use Alacritty, but it's still not packaged in Debian, and they haven't fully addressed the latency issues although, to be fair, maybe it's just an impossible task. Once it's packaged in Debian, maybe I'll reconsider.

Kitty Kitty is a "fast, feature-rich, GPU based", with ligatures, emojis, hyperlinks, pluggable, scriptable, tabs, layouts, history, file transfer over SSH, its own graphics system, and probably much more I'm forgetting. It's packaged in Debian. So I immediately got two people commenting (on IRC) that they use Kitty and are pretty happy with it. I've been hesitant in directly talking about Kitty publicly, but since it's likely there will be a pile-up of similar comments, I'll just say why it's not the first in my list, even if it might, considering it's packaged in Debian and otherwise checks all the boxes. I don't trust the Kitty code. Kitty was written by the same author as Calibre, which has a horrible security history and generally really messy source code. I have tried to do LTS work on Calibre, and have mostly given up on the idea of making that program secure in any way. See calibre for the details on that. Now it's possible Kitty is different: it's quite likely the author has gotten some experience writing (and maintaining for so long!) Calibre over the years. But I would be more optimistic if the author's reaction to the security issues were more open and proactive. I've also seen the same reaction play out on Kitty's side of things. As anyone who worked on writing or playing with non-XTerm terminal emulators, it's quite a struggle to make something (bug-for-bug) compatible with everything out there. And Kitty is in that uncomfortable place right now where it diverges from the canon and needs its own entry in the ncurses database. I don't remember the specifics, but the author also managed to get into fights with those people as well, which I don't feel is reassuring for the project going forward. If security and compatibility wasn't such big of a deal for me, I wouldn't mind so much, but I'll need a lot of convincing before I consider Kitty more seriously at this point.

Next steps It seems like Arch Linux defaults to foot in Sway, and I keep seeing it everywhere, so it is probably my next thing to try, if/when I switch to Wayland. One major problem with foot is that it's yet another terminfo entry. They did make it into ncurses (patch 2021-07-31) but only after Debian bullseye stable was released. So expect some weird compatibility issues when connecting to any other system that is older or the same as stable (!). One question mark with all Wayland terminals, and Foot in particular, is how much latency they introduce in the rendering pipeline. The foot performance and benchmarks look excellent, but do not include latency benchmarks.

No conclusion So I guess that's all I've got so far, I may try alacritty if it hits Debian, or foot if I switch to Wayland, but for now I'm hacking in xterm still. Happy to hear ideas in the comments. Stay tuned for more happy days.

9 July 2022

Dirk Eddelbuettel: Rcpp 1.0.9 on CRAN: Regular Updates

The Rcpp team is please to announce the newest release 1.0.9 of Rcpp which hit CRAN late yesterday, and has been uploaded to Debian as well. Windows and macOS builds should appear at CRAN in the next few days, as will builds in different Linux distribution and of course at r2u. The release was prepared om July 2, but it took a few days to clear a handful of spurious errors as false positives with CRAN this can when the set of reverse dependencies is so large, and the CRAN team remains busy. This release continues with the six-months cycle started with release 1.0.5 in July 2020. (This time, CRAN had asked for an interim release to silence a C++ warning; we then needed a quick follow-up to tweak tests.) As a reminder, interim dev or rc releases should generally be available in the Rcpp drat repo. These rolling release tend to work just as well, and are also fully tested against all reverse-dependencies. Rcpp has become the most popular way of enhancing R with C or C++ code. Right now, around 2559 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 252 in BioConductor. On CRAN, 13.9% of all packages depend (directly) on CRAN, and 58.5% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 61.5 million times. This release is incremental and extends Rcpp with a number of small improvements all detailed in the NEWS file as well as below. We want to highlight the external contributions: a precious list tag is cleared on removal, and a move constructor and assignment for strings has been added (thanks to Dean Scarff), and (thanks to Bill Denney and Marco Colombo) two minor errors are corrected in the vignette documentation. A big Thank You! to everybody who contributed pull request, opened or answered issues, or questions at StackOverflow or on the mailing list. The full list of details follows.

Changes in Rcpp hotfix release version 1.0.9 (2022-07-02)

Changes in Rcpp API:

Accomodate C++98 compilation by adjusting attributes.cpp (Dirk in #1193 fixing #1192)

Accomodate newest compilers replacing deprecated std::unary_function and std::binary_function with std::function (Dirk in #1202 fixing #1201 and CRAN request)

Upon removal from precious list, the tag is set to null (I aki in #1205 fixing #1203)

Move constructor and assignment for strings have been added (Dean Scarff in #1219).

Changes in Rcpp Documentation:

Adjust one overflowing column (Bill Denney in #1196 fixing #1195)

Correct a typo in the FAQ (Marco Colombo in #1217)

Changes in Rcpp Deployment:

Accomodate four digit version numbers in unit test (Dirk)

Do not run complete test suite to limit test time to CRAN preference (Dirk in #1206)

Small updates to the CI test containers have been made

Some of changes also applied to an interim release 1.0.8.3 made for CRAN on 2022-03-14.

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bugs reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under rcpp tag at StackOverflow which also allows searching among the (currently) 2886 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

1 June 2022

Daniel Lange: Get Youtube Channel ID from username

Youtube has a really nice RSS feature that is extremely well hidden. If you postfix a Channel ID to

https://www.youtube.com/feeds/videos.xml?channel_id=<id goes here>

you get a really nice Atom 1.0 (~RSS) feed for your feedreader. Unfortunately the Channel ID is hard to find while you are navigating Youtube with usernames in the URL. E.g. https://www.youtube.com/c/TED is TED's channel, full of interesting and worth-to-watch content (and some assorted horse toppings, of course). But you have to read a lot of ugly HTML / JSON in that page to find and combine https://www.youtube.com/feeds/videos.xml?channel_id=UCAuUUnT6oDeKwE6v1NGQxug which is the related RSS feed. Jeff Keeling wrote a simple Youtube RSS Extractor that does well if you have a ../playlist?... or a .../channel/... URL but it will (currently) fail on user name channels or Youtube landing pages. So how do we get the Channel ID for a Youtube user we are interested to follow? Youtube has a great API but that is gated by API keys even for the most simple calls (that came only with v3 of the API but the previous version is depreciated since 2015)¹:

dl@laptop:~$ curl 'https://www.googleapis.com/youtube/v3/channels?part=contentDetails&forUsername=DebConfVideos'

"error":
"code": 403,
"message": "The request is missing a valid API key.",
"errors": [

"message": "The request is missing a valid API key.",
"domain": "global",
"reason": "forbidden"

],
"status": "PERMISSION_DENIED"

Luckily we can throw the same (example) user name DebConfVideos at curl and grep:

dl@laptop:~$ curl -s "https://www.youtube.com/c/DebConfVideos/videos" grep -Po '"channelId":".+?"'
"channelId":"UC7SbfAPZf8SMvAxp8t51qtQ"

So https://www.youtube.com/feeds/videos.xml?channel_id=UC7SbfAPZf8SMvAxp8t51qtQ is the RSS feed for DebConfVideos. We can use individual Youtube video URLs as well. With the hack above, it'll work to find us the Chanel ID from a Youtube video URL: Working around the Youtube API restrictions to still make use of their RSS feed

Working around the Youtube API restrictions to still make use of their RSS feed

Now, some user pages may have multiple valid RSS feeds because they contain multiple channels. Remember the TED page from above? Well run:

dl@laptop:~$ curl -s "https://www.youtube.com/c/TED" grep -Po '"channelId":".+?"' cut -d \" -f 4 while read -r YTID ; do echo -n "Youtube-ID: $YTID " ; curl -s "https://www.youtube.com/feeds/videos.xml?channel_id=$YTID" grep -m 1 -P -o "(?<=<title>).+(?=</title>)" ; done

This will iterate through the Channel IDs found and show you the titles. That way you can assess which one you want to add to your feedreader. screenshot of the above

You probably want the last Channel ID listed above, the non-selective "TED" one. And that's the one from the example above. Update 02.06.2022: smpl wrote in and has the much better solution for the most frequent use cases:

You can also use get a feed directly with a username:
https://www.youtube.com/feeds/videos.xml?user=<username>
The one I use most is the one for playlists (if creators remember to
use them).
https://www.youtube.com/feeds/videos.xml?playlist_id=<playlist id>

For the common case you don't even need the channel ID that way. But it is also conveniently given in a <yt:channelId> tag (or the topmost <id> tag) within the Atom XML document. Thanks, smpl!

Actually it is even more complicated as some channels, like our DebConfVideos example, will only get you an incomplete result, cf. this StackOverflow entry. I.e. the forUsername iterator may not even work and the "best practice" seems to be mucking around with the search call.

17 April 2022

Russ Allbery: First 2022 haul post

I haven't posted one of these in a while. Here's the (mostly new) stuff that's come out that caught my interest in the past few months. Some of these I've already read and reviewed. Tom Burgis Kleptopia (non-fiction)
Angela Chen Ace (non-fiction)
P. Dj l Clark A Dead Djinn in Cairo (sff)
P. Dj l Clark The Haunting of Tram Car 015 (sff)
P. Dj l Clark A Master of Djinn (sff)
Brittney C. Cooper Eloquent Rage (non-fiction)
Madeleine Dore I Didn't Do the Thing Today (non-fiction)
Saad Z. Hossain The Gurkha and the Lord of Tuesday (sff)
George F. Kennan Memoirs, 1925-1950 (non-fiction)
Kiese Laymon How to Slowly Kill Yourself and Others in America (non-fiction)
Adam Minter Secondhand (non-fiction)
Amanda Oliver Overdue (non-fiction)
Laurie Penny Sexual Revolution (non-fiction)
Scott A. Snook Friendly Fire (non-fiction)
Adrian Tchaikovsky Elder Race (sff)
Adrian Tchaikovsky Shards of Earth (sff)
Tor.com (ed.) Some of the Best of Tor.com: 2021 (sff anthology)
Charlie Warzel & Anne Helen Petersen Out of Office (non-fiction)
Robert Wears Still Not Safe (non-fiction)
Max Weber The Vocation Lectures (non-fiction) Lots and lots of non-fiction in this mix. Maybe a tiny bit better than normal at not buying tons of books that I don't have time to read, although my reading (and particularly my reviewing) rate has been a bit slow lately.

17 March 2022

Dirk Eddelbuettel: Rcpp 1.0.8.3: Hotfixing Hotfix

An even newer hot-fix release 1.0.8.3 of Rcpp follows the 1.0.8.2 release of a few days ago and got to CRAN this morning. A Debian upload will follow shortly, and Windows and macOS binaries will appear at CRAN in the next few days. This release again breaks with the six-months cycle started with release 1.0.5 in July 2020. When we addressed the CRAN request in 1.0.8.2 we forgot to dial testing down to their desired level (as three-part release numbers do automagically for us, whereas four-part do not). This is now taken care of, along with the hot-fix that was in 1.0.8.2 already. Rcpp has become the most popular way of enhancing R with C or C++ code. Right now, around 2522 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 239 in BioConductor. The full list of details for these two interim releases (and hence all changes accumulated since the last regular release, 1.0.8 in January) follows.

Changes in Rcpp hotfix release version 1.0.8.3 (2022-03-14)

Changes in Rcpp API:

Accomodate C++98 compilation by adjusting attributes.cpp (Dirk in #1193 fixing #1192)

Accomodate newest compilers replacing deprecated std::unary_function and std::binary_function with std::function (Dirk in #1202 fixing #1201 and CRAN request)

Changes in Rcpp Documentation:

Adjust one overflowing column (Bill Denney in #1196 fixing #1195)

Changes in Rcpp Deployment:

Accomodate four digit version numbers in unit test (Dirk)

Do not run complete test suite to limit test time to CRAN preference (Dirk)

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bugs reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under rcpp tag at StackOverflow which also allows searching among the (currently) 2843 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

12 March 2022

Dirk Eddelbuettel: Rcpp 1.0.8.2: Hotfix release per CRAN request

A new hot-fix release 1.0.8.2 of Rcpp just got to CRAN. It will also be uploaded to Debian shortly, and Windows and macOS binaries will appear at CRAN in the next few days. This release breaks with the six-months cycle started with release 1.0.5 in July 2020 as CRAN desired an update to silence nags from the newest clang version which turned a little loud over a feature deprecated in C++11 (namely std::unary_function() and std::binary_function()). This was easy to replace with std::function() which we did. The release also contains a minor bugfix relative to 1.0.8 and C++98 builds, and minor correction to one pdf vignette. The release was fully tested by us and CRAN as usual against all reverse dependencies. Rcpp has become the most popular way of enhancing R with C or C++ code. Right now, around 2519 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 239 in BioConductor. The full list of details for this interim release follows.

Changes in Rcpp hotfix release version 1.0.8.2 (2022-03-10)

Changes in Rcpp API:

Accomodate C++98 compilation by adjusting attributes.cpp (Dirk in #1193 fixing #1192)

Accomodate newest compilers replacing deprecated std::unary_function and std::binary_function with std::function (Dirk in #1202 fixing #1201 and CRAN request)

Changes in Rcpp Documentation:

Adjust one overflowing column (Bill Denney in #1196 fixing #1195)

Changes in Rcpp Deployment:

Accomodate four digit version numbers in unit test (Dirk)

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bugs reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under rcpp tag at StackOverflow which also allows searching among the (currently) 2843 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

7 March 2022

Ayoyimika Ajibade: Progress Report!! Modifying Expectations...

Wait! Just like yesterday when I was accepted as an Outreachy intern and the first half of the internship is finished . How time flies when you are having a good time As part of the requirements for the final application during the contribution period for the Outreachy internship, I needed to provide a timeline to achieve our goal on my outreachy task which is transitioning of dependencies in node16 and webpack5. Having consulted my mentors who implied that the packages depending on webpack and nodejs combined are so numerous that its impossible to finish all within a space of three months but we have steps to guide us through the entire process to achieve most of our goals which are

Find a list of packages(failing rebuild and testing with autopkgtest, of reverse dependencies of webpack and nodejs) to fix.
See if new upstream versions are available that support nodejs16 and webpack5 respectively.
See if the new upstream version works and doesn't fail while rebuilding or testing with autopkgtest.
Report bug in Debian if any fails to rebuild or test with autopkgtest.
Forward bugs upstream if needed.
Fix packages and forward patches.

As of this writing(though a little late ) we have successfully rebuilt all reverse dependencies of webpack5 and split them equally each for I and my co-intern for all Javascript modules as ruby packages also depend on webpack which is a total of 44 packages. Filed a bug report on Debian bug tracking system for failing packages, also the original maintainer or uploader of the package to the Debian archive mostly Debian developers also get a mail in references to the package bug report. Sometimes the uploader who also receives the bug report decides to help out to fix the package and forward the patch upstream if need be. We have also filed an issue to upstream repo mostly via github where some respond and create PR to solve those errors and others are plain aversive to the whole idea. PR from the upstream developer is cherry-picked and a patch is created by us to incorporate the code into our own working repository. some package upstream maintainer rejects such issues or doesn't respond, we take it upon ourselves to fix the package. The total number of packages that are successfully updated and ready to be merged is 10 packages while 12 packages remain on my own end to be updated. One of the most challenging packages to update so far was prop-type as its runs its large test suite using jest of a lower version 19.0.2 compared to that of Debian OS which is version 27.5.1 updating and migrating its API's and methods to use the Debian updated version is so challenging after several googling, testing out the solution from StackOverflow, trials, and errors, reading documentations we eventually made progress with the help of my mentor, co-intern and the whole community member. It's so crazy that when I got it working I said to myself. phew it's not rocket science why can I figure it out sooner than expected I initially proposed that I would be halfway done with the project by now, I guess the reason am not able to achieve some of our goals which are finishing up with the packages for webpack and moving to transition some of the nodejs packages at all is DEBUGGING. Yes DEBUGGING! where you never can predict what the solution is. is the problem coming from Debian? or dependencies of the package you are working on, upstream bug, or dependencies of dependencies of the package you are working on, so many questions to answer. You can't easily find a solution to a bug as it takes time to try out so many guesses more of an educated guess, or even try out all the solutions from stack overflow and still no viable progress. Obviously, you cannot really know about something to set up a plan for unless you get right into it. One way of doing this, if I have to start again is the truly understand how the javascript package work under the hood, how its handles different interaction between packages, some of its dos and don't of transpiling, bundling, testing, e.t.c I guess my unrealistic goals need to be modified because some drawback that was not envisaged popped up and I underestimated the complexity of the tasks, which will be reducing the number of packages to update in transitioning of nodejs from what I planned My major focus for the second half of the internship is to fix bugs and errors I discover, file bug reports for future bugs to seek help from co-maintainer or developers, file issues upstream and close those whose bugs are already resolved for the remaining 12 packages, and ultimately successfully uploading all reverse dependencies. Also diving into transitioning of nodejs16. Thanks for stopping by

23 February 2022

Russ Allbery: Review: Elder Race

Review: Elder Race, by Adrian Tchaikovsky

Publisher:	Tordotcom
Copyright:	November 2021
ISBN:	1-250-76871-3
Format:	Kindle
Pages:	199

(It's a shame that a lot of people will be reading this novella on a black-and-white ebook reader, since the Emmanuel Shiu cover is absolutely spectacular. There's a larger image without the words at the bottom of that article.) When reports arrive at the court about demons deep in the forest that are taking over animals and humans and bending them to their will, the queen doesn't care. It's probably some unknown animal, and regardless, the forest kingdom is a rival anyway. Lynesse Fourth Daughter disagrees vehemently, but she has no power at court. Even apart from her lack of seniority, her love of stories and daring and adventures is a source of endless frustration to her mother. That is why this novella opens with her climbing the mountain path to the Tower of Nyrgoth Elder, the last of the ancient wizards, to seek his help. Nyr Illim Tevitch is an anthropologist second class of Earth's Explorer Corps, part of the second wave of Earth's outward expansion through the galaxy. In the first wave, colonies were seeded on habitable planets, only to be left stranded when Earth's civilization collapsed in an ecological crisis. Nyr was a member of a team of four, sent to make careful and limited contact with one of those lost colonies as part of Earth's second flourishing with more advanced technology. When the team lost contact with Earth, the other three went back while Nyr stayed to keep their field observations going. It's now 291 years of intermittent suspended animation later. Nyr's colleagues never came back, and there have been no messages from Earth. Elder Race is a Prime Directive anthropology story, a subgenre so long-standing that it has its own conventions and variations. Variations of the theme have been written by everyone from Eleanor Arnason to Iain M. Banks (linking to the book I have in mind is arguably a spoiler). Per the dedication, Tchaikovsky's take is based on Gene Wolfe's story "Trip, Trap," which I have not read but whose plot looks very similar. To that story structure, Tchaikovsky brings two major twists. First, Nyr is cut off from his advanced civilization, and has considerable reason to believe that civilization no longer exists. Do noninterference rules still have any meaning if Nyr is stranded and the civilization that made the rules is gone? Second, Nyr has already broken those rules rather spectacularly. More than a hundred years previously, he had ridden with Astresse Regent, a warrior queen and Lynesse's ancestor, to defeat a local warlord who had found control codes for abandoned advanced machinery and was using it as weaponry. In the process, he fell in love and made a rash promise to come to the aid of any of her descendants if he were needed. Lynesse has come to collect on the promise. Elder Race is told in alternating chapters between Nyr and Lynesse's viewpoints: first person for Nyr and tight third person for Lynesse. The core of the story is this doubled perspective, one from a young woman who wants to live in a fantasy novel and one from a deeply depressed anthropologist torn between wanting human contact, wanting to follow the rules of his profession, and wanting to explain to Lynesse that he is not a wizard. Nyr talks himself into helping with another misuse of advanced technology using the same logic he used a hundred years earlier: he's protecting Lynesse's pre-industrial society from interference rather than causing it. But the demons Lynesse wants him to fight are something entirely unexpected. This parallel understanding is a great story structure. What worked less for me was Tchaikovsky's reliance on linguistic barriers to prevent shared understanding. Whenever Nyr tries to explain something, Lynesse hears it in terms of magic and high fantasy, and often exactly backwards from how Nyr intended it. This is where my suspension of disbelief failed me, even though I normally don't have suspension of disbelief problems in SF stories. I was unable to map Lynesse's misunderstandings to any realistic linguistic model. Lynesse's language is highly complex (a realistic development within an isolated population), and Nyr complains about his inability to speak it properly given it's blizzard of complex modifiers. This is entirely believable. What is far less believable is that Lynesse perceives him as fluent in her language, but often saying the precise opposite of what he's trying to say. One chapter in the middle of the book gives Nyr's intended story side-by-side with Lynesse's understanding. This is a brilliant way to show the divide, but I found the translation errors unbelievable. If Nyr is failing that profoundly to communicate his meaning, he should be making more egregious sentence-level errors, occasionally saying something bizarre or entirely nonsensical, referring to a person as an animal or a baby, or otherwise not fluently telling a coherent story that's fundamentally different than the one he thinks he's telling. If you can put that aside, though, this is a fun story. Nyr has serious anxiety and depression made worse by his isolation, and copes by using an implanted device called a Dissociative Cognition System that lets him temporarily turn off his emotions at the cost of letting them snowball. He has a wealth of other augments and implants, including horns, which Lynesse sees as evidence that he's a different species of magical being and which he sees as occasionally irritating field equipment with annoying visual menus. The key to writing a story like this is for both perspectives to be correct given their own assumptions, and to offer insight that the other perspective is missing. I thought the linguistic part of that was unsuccessful, but the rest of it works. One of the best parts of novellas is that they don't wear out their welcome. This is a fun spin on well-trodden ground that tells a complete story in under 200 pages. I wish the ending had been a bit more satisfying and the linguistics had been more believable, but I enjoyed the time I spent in this world. Content warning for some body horror. Rating: 7 out of 10

3 February 2022

Sven Hoexter: Suntime Calculation with Lua and the Great Gift of Open Source

tl;dr I ported a part of the python-suntime library to Lua to use it on OpenWRT and RutOS powered devices. suntime.lua There are those unremarkable things which let you pause for a moment, and realize what a great gift of our time open source software and open knowledge is. At some point in time someone figured out how to calculate the sunrise and sunset time on the current date for your location. Someone else wrote that up and probably again a different person published it on the internet. The Internet Archive preserved a copy of it so I can still link to it. Someone took this algorithm and published a code sample on StackOverflow, which was later on used by the SatAgro guys to create the python-suntime library. Now I could come along, copy the core function of this library, convert it within a few hours - mostly spent learning a bit of Lua, to a working script fulfilling my needs.

14 January 2022

Dirk Eddelbuettel: Rcpp 1.0.8: Updated, Strict Headers

The Rcpp team is thrilled to share the news of the newest release 1.0.8 of Rcpp which hit CRAN today, and has already been uploaded to Debian as well. Windows and macOS builds should appear at CRAN in the next few days. This release continues with the six-months cycle started with release 1.0.5 in July 2020. As a reminder, interim dev or rc releases will alwasys be available in the Rcpp drat repo; this cycle there were once again seven (!!) times two as we also tested the modified header (more below). These rolling release tend to work just as well, and are also fully tested against all reverse-dependencies. Rcpp has become the most popular way of enhancing R with C or C++ code. Right now, around 2478 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 242 in BioConductor. This release finally brings a change we have worked on quite a bit over the last few months. The idea of enforcing the setting of STRICT_R_HEADERS was prososed years ago in 2016 and again in 2018. But making such a chance against a widely-deployed code base has repurcussions, and we were not ready then. Last April, this was revisited in issue #1158. Over the course of numerous lengthy runs of tests of a changed Rcpp package against (essentially) all reverse-dependencies (i.e. packages which use Rcpp) we identified ninetyfour packages in total which needed a change. We provided either a patch we emailed, or a GitHub pull request, to all ninetyfour. And we are happy to say that eighty cases were resolved via a new CRAN upload, with a seven more having merged the pull request but not yet uploaded. Hence, we could make the case to CRAN (who were always CC ed on the monthly nag emails we sent to maintainers of packages needing a change) that an upload was warranted. And after a brief period for their checks and inspection, our January 11 release of Rcpp 1.0.8 arrived on CRAN on January 13. So with that, a big and heartfelt Thank You! to all eighty maintainers for updating their packages to permit this change at the Rcpp end, to CRAN for the extra checking, and to everybody else who I bugged with the numerous emails and updated to the seemingly never-ending issue #1158. We all got this done, and that is a Good Thing (TM). Other than the aforementioned change which will not automatically set STRICT_R_HEADERS (unless opted out which one can), a number of nice pull request by a number of contributors are included in this release:

I aki generalized use of finalizers for external pointers in #1180
Kevin ensured include paths are always quoted in #1189
Dirk added new headers to allow a more fine-grained choice of Rcpp feature for faster builds in #1191
Travers Ching extended the function signature generator to allow for a default R argument in #1184 and #1187
Dirk extended documentation, removed old example code, updated references and refreshed CI setup in several PRs (see below)

The full list of details follows.

Changes in Rcpp release version 1.0.8 (2022-01-11)

Changes in Rcpp API:

STRICT_R_HEADERS is now enabled by default, see extensive discussion in #1158 closing #898.

A new #define allows default setting of finalizer calls for external pointers (I aki in #1180 closing #1108).

Rcpp:::CxxFlags() now quotes the include path generated, (Kevin in #1189 closing #1188).

New header files Rcpp/Light, Rcpp/Lighter, Rcpp/Lightest and default Rcpp/Rcpp for fine-grained access to features (and compilation time) (Dirk #1191 addressing #1168).

Changes in Rcpp Attributes:

A new option signature allows customization of function signatures (Travers Ching in #1184 and #1187 fixing #1182)

Changes in Rcpp Documentation:

The Rcpp FAQ has a new entry on how not to grow a vector (Dirk in #1167).

Some long-spurious calls to RNGSope have been removed from examples (Dirk in #1173 closing #1172).

DOI reference in the bibtex files have been updated per JSS request (Dirk in #1186).

Changes in Rcpp Deployment:

Some continuous integration components have been updated (Dirk in #1174, #1181, and #1190).

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bugs reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under rcpp tag at StackOverflow which also allows searching among the (currently) 2822 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

31 December 2021

Chris Lamb: Favourite books of 2021: Fiction

In my two most recent posts, I listed the memoirs and biographies and followed this up with the non-fiction I enjoyed the most in 2021. I'll leave my roundup of 'classic' fiction until tomorrow, but today I'll be going over my favourite fiction. Books that just miss the cut here include Kingsley Amis' comic Lucky Jim, Cormac McCarthy's The Road (although see below for McCarthy's Blood Meridian) and the Complete Adventures of Tintin by Herg , the latter forming an inadvertently incisive portrait of the first half of the 20th century. Like ever, there were a handful of books that didn't live up to prior expectations. Despite all of the hype, Emily St. John Mandel's post-pandemic dystopia Station Eleven didn't match her superb The Glass Hotel (one of my favourite books of 2020). The same could be said of John le Carr 's The Spy Who Came in from the Cold, which felt significantly shallower compared to Tinker, Tailor, Soldier, Spy again, a favourite of last year. The strangest book (and most difficult to classify at all) was undoubtedly Patrick S skind's Perfume: The Story of a Murderer, and the non-fiction book I disliked the most was almost-certainly Beartown by Fredrik Bachman. Two other mild disappointments were actually film adaptions. Specifically, the original source for Vertigo by Pierre Boileau and Thomas Narcejac didn't match Alfred Hitchock's 1958 masterpiece, as did James Sallis' Drive which was made into a superb 2011 neon-noir directed by Nicolas Winding Refn. These two films thus defy the usual trend and are 'better than the book', but that's a post for another day.

A Wizard of Earthsea (1971) Ursula K. Le Guin How did it come to be that Harry Potter is the publishing sensation of the century, yet Ursula K. Le Guin's Earthsea is only a popular cult novel? Indeed, the comparisons and unintentional intertextuality with Harry Potter are entirely unavoidable when reading this book, and, in almost every respect, Ursula K. Le Guin's universe comes out the victor. In particular, the wizarding world that Le Guin portrays feels a lot more generous and humble than the class-ridden world of Hogwarts School of Witchcraft and Wizardry. Just to take one example from many, in Earthsea, magic turns out to be nurtured in a bottom-up manner within small village communities, in almost complete contrast to J. K. Rowling's concept of benevolent government departments and NGOs-like institutions, which now seems a far too New Labour for me. Indeed, imagine an entire world imbued with the kindly benevolence of Dumbledore, and you've got some of the moral palette of Earthsea. The gently moralising tone that runs through A Wizard of Earthsea may put some people off:

Vetch had been three years at the School and soon would be made Sorcerer; he thought no more of performing the lesser arts of magic than a bird thinks of flying. Yet a greater, unlearned skill he possessed, which was the art of kindness.

Still, these parables aimed directly at the reader are fairly rare, and, for me, remain on the right side of being mawkish or hectoring. I'm thus looking forward to reading the next two books in the series soon.

Blood Meridian (1985) Cormac McCarthy Blood Meridian follows a band of American bounty hunters who are roaming the Mexican-American borderlands in the late 1840s. Far from being remotely swashbuckling, though, the group are collecting scalps for money and killing anyone who crosses their path. It is the most unsparing treatment of American genocide and moral depravity I have ever come across, an anti-Western that flouts every convention of the genre. Blood Meridian thus has a family resemblance to that other great anti-Western, Once Upon a Time in the West: after making a number of gun-toting films that venerate the American West (ie. his Dollars Trilogy), Sergio Leone turned his cynical eye to the western. Yet my previous paragraph actually euphemises just how violent Blood Meridian is. Indeed, I would need to be a much better writer (indeed, perhaps McCarthy himself) to adequately 0utline the tone of this book. In a certain sense, it's less than you read this book in a conventional sense, but rather that you are forced to witness successive chapters of grotesque violence... all occurring for no obvious reason. It is often said that books 'subvert' a genre and, indeed, I implied as such above. But the term subvert implies a kind of Puck-like mischievousness, or brings to mind court jesters licensed to poke fun at the courtiers. By contrast, however, Blood Meridian isn't funny in the slightest. There isn't animal cruelty per se, but rather wanton negligence of another kind entirely. In fact, recalling a particular passage involving an injured horse makes me feel physically ill. McCarthy's prose is at once both baroque in its language and thrifty in its presentation. As Philip Connors wrote back in 2007, McCarthy has spent forty years writing as if he were trying to expand the Old Testament, and learning that McCarthy grew up around the Church therefore came as no real surprise. As an example of his textual frugality, I often looked for greater precision in the text, finding myself asking whether who a particular 'he' is, or to which side of a fight some two men belonged to. Yet we must always remember that there is no precision to found in a gunfight, so this infidelity is turned into a virtue. It's not that these are fair fights anyway, or even 'murder': Blood Meridian is just slaughter; pure butchery. Murder is a gross understatement for what this book is, and at many points we are grateful that McCarthy spares us precision. At others, however, we can be thankful for his exactitude. There is no ambiguity regarding the morality of the puppy-drowning Judge, for example: a Colonel Kurtz who has been given free license over the entire American south. There is, thank God, no danger of Hollywood mythologising him into a badass hero. Indeed, we must all be thankful that it is impossible to film this ultra-violent book... Indeed, the broader idea of 'adapting' anything to this world is, beyond sick. An absolutely brutal read; I cannot recommend it highly enough.

Bodies of Light (2014) Sarah Moss Bodies of Light is a 2014 book by Glasgow-born Sarah Moss on the stirrings of women's suffrage within an arty clique in nineteenth-century England. Set in the intellectually smoggy cities of Manchester and London, this poignant book follows the studiously intelligent Alethia 'Ally' Moberly who is struggling to gain the acceptance of herself, her mother and the General Medical Council. You can read my full review from July.

House of Leaves (2000) Mark Z. Danielewski House of Leaves is a remarkably difficult book to explain. Although the plot refers to a fictional documentary about a family whose house is somehow larger on the inside than the outside, this quotidian horror premise doesn't explain the complex meta-commentary that Danielewski adds on top. For instance, the book contains a large number of pseudo-academic footnotes (many of which contain footnotes themselves), with references to scholarly papers, books, films and other articles. Most of these references are obviously fictional, but it's the kind of book where the joke is that some of them are not. The format, structure and typography of the book is highly unconventional too, with extremely unusual page layouts and styles. It's the sort of book and idea that should be a tired gimmick but somehow isn't. This is particularly so when you realise it seems specifically designed to create a fandom around it and to manufacturer its own 'cult' status, something that should be extremely tedious. But not only does this not happen, House of Leaves seems to have survived through two exhausting decades of found footage: The Blair Witch Project and Paranormal Activity are, to an admittedly lesser degree, doing much of the same thing as House of Leaves. House of Leaves might have its origins in Nabokov's Pale Fire or even Derrida's Glas, but it seems to have more in common with the claustrophobic horror of Cube (1997). And like all of these works, House of Leaves book has an extremely strange effect on the reader or viewer, something quite unlike reading a conventional book. It wasn't so much what I got out of the book itself, but how it added a glow to everything else I read, watched or saw at the time. An experience.

Milkman (2018) Anna Burns This quietly dazzling novel from Irish author Anna Burns is full of intellectual whimsy and oddball incident. Incongruously set in 1970s Belfast during The Irish Troubles, Milkman's 18-year-old narrator (known only as middle sister ), is the kind of dreamer who walks down the street with a Victorian-era novel in her hand. It's usually an error for a book that specifically mention other books, if only because inviting comparisons to great novels is grossly ill-advised. But it is a credit to Burns' writing that the references here actually add to the text and don't feel like they are a kind of literary paint by numbers. Our humble narrator has a boyfriend of sorts, but the figure who looms the largest in her life is a creepy milkman an older, married man who's deeply integrated in the paramilitary tribalism. And when gossip about the narrator and the milkman surfaces, the milkman beings to invade her life to a suffocating degree. Yet this milkman is not even a milkman at all. Indeed, it's precisely this kind of oblique irony that runs through this daring but darkly compelling book.

The First Fifteen Lives of Harry August (2014) Claire North Harry August is born, lives a relatively unremarkable life and finally dies a relatively unremarkable death. Not worth writing a novel about, I suppose. But then Harry finds himself born again in the very same circumstances, and as he grows from infancy into childhood again, he starts to remember his previous lives. This loop naturally drives Harry insane at first, but after finding that suicide doesn't stop the quasi-reincarnation, he becomes somewhat acclimatised to his fate. He prospers much better at school the next time around and is ultimately able to make better decisions about his life, especially when he just happens to know how to stay out of trouble during the Second World War. Yet what caught my attention in this 'soft' sci-fi book was not necessarily the book's core idea but rather the way its connotations were so intelligently thought through. Just like in a musical theme and varations, the success of any concept-driven book is far more a product of how the implications of the key idea are played out than how clever the central idea was to begin with. Otherwise, you just have another neat Borges short story: satisfying, to be sure, but in a narrower way. From her relatively simple premise, for example, North has divined that if there was a community of people who could remember their past lives, this would actually allow messages and knowledge to be passed backwards and forwards in time. Ah, of course! Indeed, this very mechanism drives the plot: news comes back from the future that the progress of history is being interfered with, and, because of this, the end of the world is slowly coming. Through the lives that follow, Harry sets out to find out who is passing on technology before its time, and work out how to stop them. With its gently-moralising romp through the salient historical touchpoints of the twentieth century, I sometimes got a whiff of Forrest Gump. But it must be stressed that this book is far less certain of its 'right-on' liberal credentials than Robert Zemeckis' badly-aged film. And whilst we're on the topic of other media, if you liked the underlying conceit behind Stuart Turton's The Seven Deaths of Evelyn Hardcastle yet didn't enjoy the 'variations' of that particular tale, then I'd definitely give The First Fifteen Lives a try. At the very least, 15 is bigger than 7. More seriously, though, The First Fifteen Lives appears to reflect anxieties about technology, particularly around modern technological accelerationism. At no point does it seriously suggest that if we could somehow possess the technology from a decade in the future then our lives would be improved in any meaningful way. Indeed, precisely the opposite is invariably implied. To me, at least, homo sapiens often seems to be merely marking time until we can blow each other up and destroying the climate whilst sleepwalking into some crisis that might precipitate a thermonuclear genocide sometimes seems to be built into our DNA. In an era of cli-fi fiction and our non-fiction newspaper headlines, to label North's insight as 'prescience' might perhaps be overstating it, but perhaps that is the point: this destructive and negative streak is universal to all periods of our violent, insecure species.

The Goldfinch (2013) Donna Tartt After Breaking Bad, the second biggest runaway success of 2014 was probably Donna Tartt's doorstop of a novel, The Goldfinch. Yet upon its release and popular reception, it got a significant number of bad reviews in the literary press with, of course, an equal number of predictable think pieces claiming this was sour grapes on the part of the cognoscenti. Ah, to be in 2014 again, when our arguments were so much more trivial. For the uninitiated, The Goldfinch is a sprawling bildungsroman that centres on Theo Decker, a 13-year-old whose world is turned upside down when a terrorist bomb goes off whilst visiting the Metropolitan Museum of Art, killing his mother among other bystanders. Perhaps more importantly, he makes off with a painting in order to fulfil a promise to a dying old man: Carel Fabritius' 1654 masterpiece The Goldfinch. For the next 14 years (and almost 800 pages), the painting becomes the only connection to his lost mother as he's flung, almost entirely rudderless, around the Western world, encountering an array of eccentric characters. Whatever the critics claimed, Tartt's near-perfect evocation of scenes, from the everyday to the unimaginable, is difficult to summarise. I wouldn't label it 'cinematic' due to her evocation of the interiority of the characters. Take, for example: Even the suggestion that my father had close friends conveyed a misunderstanding of his personality that I didn't know how to respond it's precisely this kind of relatable inner subjectivity that cannot be easily conveyed by film, likely is one of the main reasons why the 2019 film adaptation was such a damp squib. Tartt's writing is definitely not 'impressionistic' either: there are many near-perfect evocations of scenes, even ones we hope we cannot recognise from real life. In particular, some of the drug-taking scenes feel so credibly authentic that I sometimes worried about the author herself. Almost eight months on from first reading this novel, what I remember most was what a joy this was to read. I do worry that it won't stand up to a more critical re-reading (the character named Xandra even sounds like the pharmaceuticals she is taking), but I think I'll always treasure the first days I spent with this often-beautiful novel.

Beyond Black (2005) Hilary Mantel Published about five years before the hyperfamous Wolf Hall (2004), Hilary Mantel's Beyond Black is a deeply disturbing book about spiritualism and the nature of Hell, somewhat incongruously set in modern-day England. Alison Harte is a middle-aged physic medium who works in the various towns of the London orbital motorway. She is accompanied by her stuffy assistant, Colette, and her spirit guide, Morris, who is invisible to everyone but Alison. However, this is no gentle and musk-smelling world of the clairvoyant and mystic, for Alison is plagued by spirits from her past who infiltrate her physical world, becoming stronger and nastier every day. Alison's smiling and rotund persona thus conceals a truly desperate woman: she knows beyond doubt the terrors of the next life, yet must studiously conceal them from her credulous clients. Beyond Black would be worth reading for its dark atmosphere alone, but it offers much more than a chilling and creepy tale. Indeed, it is extraordinarily observant as well as unsettlingly funny about a particular tranche of British middle-class life. Still, the book's unnerving nature that sticks in the mind, and reading it noticeably changed my mood for days afterwards, and not necessarily for the best.

The Wall (2019) John Lanchester The Wall tells the story of a young man called Kavanagh, one of the thousands of Defenders standing guard around a solid fortress that envelopes the British Isles. A national service of sorts, it is Kavanagh's job to stop the so-called Others getting in. Lanchester is frank about what his wall provides to those who stand guard: the Defenders of the Wall are conscripted for two years on the Wall, with no exceptions, giving everyone in society a life plan and a story. But whilst The Wall is ostensibly about a physical wall, it works even better as a story about the walls in our mind. In fact, the book blends together of some of the most important issues of our time: climate change, increasing isolation, Brexit and other widening societal divisions. If you liked P. D. James' The Children of Men you'll undoubtedly recognise much of the same intellectual atmosphere, although the sterility of John Lanchester's dystopia is definitely figurative and textual rather than literal. Despite the final chapters perhaps not living up to the world-building of the opening, The Wall features a taut and engrossing narrative, and it undoubtedly warrants even the most cursory glance at its symbolism. I've yet to read something by Lanchester I haven't enjoyed (even his short essay on cheating in sports, for example) and will be definitely reading more from him in 2022.

The Only Story (2018) Julian Barnes The Only Story is the story of Paul, a 19-year-old boy who falls in love with 42-year-old Susan, a married woman with two daughters who are about Paul's age. The book begins with how Paul meets Susan in happy (albeit complicated) circumstances, but as the story unfolds, the novel becomes significantly more tragic and moving. Whilst the story begins from the first-person perspective, midway through the book it shifts into the second person, and, later, into the third as well. Both of these narrative changes suggested to me an attempt on the part of Paul the narrator (if not Barnes himself), to distance himself emotionally from the events taking place. This effect is a lot more subtle than it sounds, however: far more prominent and devastating is the underlying and deeply moving story about the relationship ends up. Throughout this touching book, Barnes uses his mastery of language and observation to avoid the saccharine and the maudlin, and ends up with a heart-wrenching and emotive narrative. Without a doubt, this is the saddest book I read this year.

5 December 2021

Reproducible Builds: Reproducible Builds in November 2021

Welcome to the November 2021 report from the Reproducible Builds project. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is therefore to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. If you are interested in contributing to our project, please visit our Contribute page on our website.

On November 6th, Vagrant Cascadian presented at this year s edition of the SeaGL conference, giving a talk titled Debugging Reproducible Builds One Day at a Time:

I ll explore how I go about identifying issues to work on, learn more about the specific issues, recreate the problem locally, isolate the potential causes, dissect the problem into identifiable parts, and adapt the packaging and/or source code to fix the issues.

A video recording of the talk is available on archive.org.

Fedora Magazine published a post written by Zbigniew J drzejewski-Szmek about how to Use Diffoscope in packager workflows, specifically around ensuring that new versions of a package do not introduce breaking changes:

In the role of a packager, updating packages is a recurring task. For some projects, a packager is involved in upstream maintenance, or well written release notes make it easy to figure out what changed between the releases. This isn t always the case, for instance with some small project maintained by one or two people somewhere on GitHub, and it can be useful to verify what exactly changed. Diffoscope can help determine the changes between package releases. [ ]

kpcyrd announced the release of rebuilderd version 0.16.3 on our mailing list this month, adding support for builds to generate multiple artifacts at once.

Lastly, we held another IRC meeting on November 30th. As mentioned in previous reports, due to the global events throughout 2020 etc. there will be no in-person summit event this year.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes, including preparing and uploading versions `190`, `191`, `192`, `193` and `194` to Debian:

New features:

Continue loading a `.changes` file even if the referenced files do not exist, but include a comment in the returned diff. [ ]

Log the reason if we cannot load a Debian `.changes` file. [ ]

Bug fixes:

Detect XML files as XML files if `file(1)` claims if they are XML files or if they are named `.xml`. (`#999438`)

Don t duplicate file lists at each directory level. (`#989192`)

Don t raise a traceback when comparing nested directories with non-directories. [ ]

Re-enable `test_android_manifest`. [ ]

Don t reject Debian `.changes` files if they contain non-printable characters. [ ]

Codebase improvements:

Avoid aliasing variables if we aren t going to use them. [ ]

Use `isinstance` over `type`. [ ]

Drop a number of unused imports. [ ]

Update a bunch of `%`-style string interpolations into f-strings or `str.format`. [ ]

When pretty-printing JSON, mark the difference as being reformatted, additionally avoiding including the full path. [ ]

Import `itertools` top-level module directly. [ ]

Chris Lamb also made an update to the command-line client to trydiffoscope, a web-based version of the diffoscope in-depth and content-aware diff utility, specifically only waiting for 2 minutes for `try.diffoscope.org` to respond in tests. (#998360) In addition Brandon Maier corrected an issue where parts of large diffs were missing from the output [ ], Zbigniew J drzejewski-Szmek fixed some logic in the `assert_diff_startswith` method [ ] and Mattia Rizzolo updated the packaging metadata to denote that we support both Python 3.9 and 3.10 [ ] as well as a number of warning-related changes[ ][ ]. Vagrant Cascadian also updated the diffoscope package in GNU Guix [ ][ ].

Distribution work In Debian, Roland Clobus updated the wiki page documenting Debian reproducible Live images to mention some new bug reports and also posted an in-depth status update to our mailing list. In addition, 90 reviews of Debian packages were added, 18 were updated and 23 were removed this month adding to our knowledge about identified issues. Chris Lamb identified a new toolchain issue, absolute_path_in_cmake_file_generated_by_meson.
Work has begun on classifying reproducibility issues in packages within the Arch Linux distribution. Similar to the analogous effort within Debian (outlined above), package information is listed in a human-readable `packages.yml` YAML file and a sibling `README.md` file shows how to classify packages too. Finally, Bernhard M. Wiedemann posted his monthly reproducible builds status report for openSUSE and Vagrant Cascadian updated a link on our website to link to the GNU Guix reproducibility testing overview [ ].

Software development The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Bernhard M. Wiedemann:

`ck` (build failure in single-CPU machine)

`libfabric` (date-related issue)

`python-dukpy-kovidgoyal` (filesystem ordering)

`python-thriftpy2` (build failure in the far future)

`smplayer` (date)

Chris Lamb:

#998312 filed against `ibus-input-pad`.

#999848 filed against `node-cssstyle`.

#999866 filed against `liboqs`.

#1000326 filed against `xrstools`.

#1000327 filed against `meson` (forwarded).

#1000401 filed against `golang-github-go-git-go-git`.

#1000531 filed against `sphinxcontrib-applehelp`.

#1000532 filed against `sphinxcontrib-jsmath`.

#1000533 filed against `sphinxcontrib-htmlhelp`.

#1000535 filed against `sphinxcontrib-restbuilder`.

#1000769 filed against `node-marked`.

#1000770 filed against `perfect-scrollbar`.

Roland Clobus:

#1000674 and #1000685 filed against `dictionaries-common` (randomness in Ispell dictionaries)

Simon McVittie:

#999552 filed against `pcs`.

Vagrant Cascadian:

#998420 filed against `minia`.

#1000768 filed against `krb5`.

#1000836 filed against `libu2f-host`.

#1000839 filed against `gutenprint`.

#1000893 filed against `bind9`.

#1000897 filed against `lift`.

#1000921 filed against `syncevolution`.

#1000944 filed against `apbs`.

#1000945 filed against `binutils-riscv64-unknown-elf`.

#1000946 filed against `gcc-riscv64-unknown-elf`.

`opensbi`, which later led to a better patch on their mailing list.

Elsewhere, in software development, Jonas Witschel updated strip-nondeterminism, our tool to remove specific non-deterministic results from a completed build so that it did not fail on JAR archives containing invalid members with a .jar extension [ ]. This change was later uploaded to Debian by Chris Lamb. reprotest is the Reproducible Build s project end-user tool to build the same source code twice in widely different environments and checking whether the binaries produced by the builds have any differences. This month, Mattia Rizzolo overhauled the Debian packaging [ ][ ][ ] and fixed a bug surrounding suffixes in the Debian package version [ ], whilst Stefano Rivera fixed an issue where the package tests were broken after the removal of diffoscope from the package s strict dependencies [ ].

Testing framework The Reproducible Builds project runs a testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:

Holger Levsen:

Document the progress in setting up `snapshot.reproducible-builds.org`. [ ]

Add the packages required for debian-snapshot. [ ]

Make the `dstat` package available on all Debian based systems. [ ]

Mark `virt32b-armhf` and `virt64b-armhf` as down. [ ]

Jochen Sprickerhof:

Add SSH authentication key and enable access to the `osuosl168-amd64` node. [ ][ ]

Mattia Rizzolo:

Revert reproducible Debian: mark virt(32 64)b-armhf as down - restored. [ ]

Roland Clobus (Debian live image generation):

Rename `sid` internally to `unstable` until an issue in the snapshot system is resolved. [ ]

Extend testing to include Debian bookworm too.. [ ]

Automatically create the Jenkins view to display jobs related to building the Live images. [ ]

Vagrant Cascadian:

Add a Debian package set group for the packages and tools maintained by the Reproducible Builds maintainers themselves. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Twitter: @ReproBuilds

Mailing list: `rb-general@lists.reproducible-builds.org`

4 November 2021

Sandro Tosi: Python: send emails with embedded images

to send emails with images you need to use MIMEMultipart, but the basic approach:

import smtplib

from email.mime.multipart import MIMEMultipart
from email.mime.image import MIMEImage

msg = MIMEMultipart('alternative')
msg['Subject'] = "subject"
msg['From'] = from_addr
msg['To'] = to_addr

part = MIMEImage(open('/path/to/image', 'rb').read())

s = smtplib.SMTP('localhost')
s.sendmail(from_addr, to_addr, msg.as_string())
s.quit()

will produce an email with empty body and the image as an attachment.The better way, ie to have the image as part of the body of the email, requires to write an HTML body that refers to that image:

import smtplib

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

msg = MIMEMultipart('alternative')
msg['Subject'] = "subject
msg['From'] = from_addr
msg['To'] = to_addr

text = MIMEText('<img src="cid:image1">', 'html')
msg.attach(text)

image = MIMEImage(open('/path/to/image', 'rb').read())

# Define the image's ID as referenced in the HTML body above
image.add_header('Content-ID', '<image1>')
msg.attach(image)

s = smtplib.SMTP('localhost')
s.sendmail(from_addr, to_addr, msg.as_string())
s.quit()

The trick is to define an image with a specific Content-ID and make that the only item in an HTML body: now you have an email with contains that specific image as the only content of the body, embedded in it.

Bonus point: if you want to take a snapshot of a webpage (which is kinda the reason i needed the code above) i found it extremely useful to use the Google PageSpeed Insights API; a good description on how to use that API with Python is available at this StackOverflow answer.

UPDATE (2020-12-26): I was made aware via email that some mail providers may not display images inline when the Content-ID value is too short (say, for example, Content-ID: 1). A solution that seems to work on most of the providers is using a sequence of random chars prefixed with a dot and suffixed with a valid mail domain.

Next.

Previous.